
webproxy: migrate 'service webproxy' to get_config_dict()
Closed, Resolved | Public | FEATURE REQUEST

Description

Squid proxy is old and inefficient.
While it did a great job in the past, it no longer really meets current needs.

My proposal is to move to Apache Traffic Server

Lots of benefits:

  • no squid
  • better performance
  • better flexibility

see https://www.slideshare.net/bryan_call/choosing-a-proxy-server-apachecon-2014

Details

Difficulty level
Hard (possibly days)
Version
-
Why the issue appeared?
Will be filled on close
Is it a breaking change?
Unspecified (possibly destroys the router)

Event Timeline

syncer created this task.

Notes:

  • trafficserver (buster native - 19.7 MB of additional disk space will be used)
  • looks like squidguard can't be integrated (removing it entirely?)

cli options:

  • append-domain (TS uses the local domain, so it can't be configured anymore)
  • transparent mode won't be the default anymore, since it requires a few iptables rules
  • default-port will be removed
  • domain-block and domain-noncache will be removed, as they don't exist in TS
  • reply-block-mime and reply-body-max-size will be removed, as they don't exist in TS
  • url-filtering and whitelist will be removed, as they don't exist in TS (in TS, whitelist instead defines a list of trusted proxy IPs)

Input appreciated.

hagbard raised the priority of this task from Low to High. Jan 21 2020, 9:08 PM
hagbard changed Difficulty level from Unknown (require assessment) to Normal (likely a few hours).
hagbard set Is it a breaking change? to Unspecified (possibly destroys the router).
hagbard changed Difficulty level from Normal (likely a few hours) to Hard (possibly days). Jan 21 2020, 9:28 PM
hagbard changed the task status from Open to On hold. Jan 23 2020, 4:51 PM

@syncer After all considerations, because of the authentication modules squid brings in, I would rather stay with squid for now. Let me know what you think.

hagbard changed the task status from On hold to Confirmed. Jan 26 2020, 10:54 PM

All right, we stay with squid. However, I may drop squidguard, but I'll ask in the forum first whether that feature is required by many users.

hagbard lowered the priority of this task from High to Normal. Jan 26 2020, 10:59 PM
hagbard renamed this task from Migrate web proxy from squid to apache traffic server to Migrate 'service webproxy' to python/xml. Feb 11 2020, 11:16 PM
hagbard changed the task status from Confirmed to In progress.
hagbard moved this task from Need Triage to In Progress on the VyOS 1.3 Equuleus board.

Squid will be used for authentication and for controlling name resolution (pointing to a special DNS or so?); no squidguard or caching will be used anymore. It also ran in transparent mode by default, which requires an iptables rule set. I think that feature can be removed, since a transparent proxy has no authentication options anyway.

Please use the new get_config_dict() API calls.

In T563#74302, @c-po wrote:

Please use the new get_config_dict() API calls.

Yeah, doing that already. Also the airbag function. A lot has changed since I last had a chance to do anything for VyOS.
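For readers unfamiliar with the pattern being discussed, here is a minimal sketch of a conf-mode script that works on a single configuration dictionary. It is not the actual webproxy implementation: `FakeConfig` is a hypothetical stand-in for the real VyOS config object (whose `get_config_dict()` signature should be checked against the VyOS source), `mangle_keys` is an illustrative helper, and the `listen_address`/`port` layout of the dictionary is assumed for the example.

```python
# Sketch of the get_config() / verify() / generate() split around a config dict.
# FakeConfig is a HYPOTHETICAL stand-in, not the real VyOS config class.

class FakeConfig:
    """Hypothetical config object backed by a plain nested dict."""

    def __init__(self, tree):
        self._tree = tree

    def exists(self, path):
        node = self._tree
        for key in path:
            if not isinstance(node, dict) or key not in node:
                return False
            node = node[key]
        return True

    def get_config_dict(self, path):
        # Return the subtree below the given path as a nested dict.
        node = self._tree
        for key in path:
            node = node[key]
        return node


def mangle_keys(node):
    """Illustrative helper: replace '-' with '_' in all dict keys, recursively."""
    if isinstance(node, dict):
        return {k.replace('-', '_'): mangle_keys(v) for k, v in node.items()}
    return node


def get_config(conf):
    base = ['service', 'webproxy']
    if not conf.exists(base):
        return None
    return mangle_keys(conf.get_config_dict(base))


def verify(proxy):
    if proxy is None:
        return
    if 'listen_address' not in proxy:
        raise ValueError('at least one listen-address is required')


def generate(proxy):
    """Render squid http_port lines; 3128 is squid's default port."""
    if proxy is None:
        return ''
    lines = []
    for address, opts in sorted(proxy['listen_address'].items()):
        port = opts.get('port', '3128')
        lines.append(f'http_port {address}:{port}')
    return '\n'.join(lines) + '\n'
```

The point of the split is that `verify()` and `generate()` only ever see a plain dictionary, so they can be exercised without a running router.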

Is there any interest in the following scenarios:

  1. squid can use different DNS servers instead of the ones from resolv.conf (including DNS fine-tuning like dns_v4_first, dns_timeout, etc.)
  2. NTLM authentication (I could use some testers for testing with AD for that)
  3. URL rewriting (currently you can block domains there, which could also be done with a prepared DNS and option 1)
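As a rough illustration of scenario 1, the sketch below renders squid's `dns_nameservers`, `dns_v4_first` and `dns_timeout` directives from a dictionary. The function name and the dictionary keys (`nameservers`, `v4_first`, `timeout`) are made up for this example and do not reflect any CLI syntax decided in this task.

```python
def render_dns_options(dns):
    """Return squid.conf lines for the given DNS settings.

    The keys of `dns` are hypothetical; only the squid directive
    names on the left-hand side come from squid itself.
    """
    lines = []
    servers = dns.get('nameservers', [])
    if servers:
        # Without this directive squid falls back to resolv.conf.
        lines.append('dns_nameservers ' + ' '.join(servers))
    if 'v4_first' in dns:
        lines.append('dns_v4_first ' + ('on' if dns['v4_first'] else 'off'))
    if 'timeout' in dns:
        lines.append(f"dns_timeout {dns['timeout']} seconds")
    return lines


if __name__ == '__main__':
    example = {'nameservers': ['192.0.2.53'], 'v4_first': True, 'timeout': 10}
    print('\n'.join(render_dns_options(example)))
```

Pointing `dns_nameservers` at a filtering resolver is what would make the DNS-based blocking mentioned in item 3 possible.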

I've previously mentioned light blocking (domain level, gTLD level), but with the increasing amount of DoH, having a means to kill off DoH and force special DNS processing, including offload to a separate DNS server (managed by a security appliance for example, say PiHole or similar) would be valuable.

I agree, a separate DNS would be way easier to maintain if you have a lot of TLDs you need/want to block, since squid has to load them from a list. Let's see if anyone is still using that; otherwise it would be nicer and easier to scrap that and implement a nameserver tag node in the CLI.

Does anyone know if LDAP auth worked at all with the old Perl backend? I'm trying to find out how likely it is that I need to migrate CLI entries. From what I have seen, LDAP auth with anonymous LDAP browsing didn't generate any of the required config for squid.

Large enterprises usually use LDAP/AD to authenticate their users and log their web browsing, so this should be added. Anonymous binding is kind of old-fashioned, so maybe it was a bug.

The Perl scripts didn't create any config line, that's why I'm asking. I have it already implemented and successfully tested with the new Python code, but I wonder how people were able to use it at all by just using the CLI. I may need somebody for testing with AD, since I don't have access to any AD environment anymore.

Because transparent proxy, which was the default, is being removed for now, the first version will have 2 authentication modes: one by IP address or network (nothing else is required as long as you have the correct src IP) and one by LDAP (either anonymous or with a bind-dn to browse LDAP). I have both mechanisms already working via the CLI and am about to clean up and test right now. If anyone needs a special authentication mechanism, please let me know. I also disabled local file caches: since most traffic is HTTPS these days anyway, we can take some pressure off the filesystem (SSD).
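To make the two modes concrete, here is a hedged sketch of the squid.conf lines they might translate to. It is not the code written for this task: the dictionary keys, the ACL names and the helper path `/usr/lib/squid/basic_ldap_auth` (the usual Debian location) are assumptions, and the quoting assumes values contain no double quotes.

```python
# Assumed Debian path of squid's LDAP basic-auth helper.
LDAP_HELPER = '/usr/lib/squid/basic_ldap_auth'


def quote(value):
    """Wrap a value in double quotes (assumes it contains none itself)."""
    return f'"{value}"'


def render_access(auth):
    """Return squid.conf lines for LDAP or source-address based access.

    `auth` uses hypothetical keys: either 'ldap' (a dict) or 'networks' (a list).
    """
    lines = []
    if 'ldap' in auth:
        ldap = auth['ldap']
        args = ['-b', quote(ldap['base_dn']),
                '-f', quote(ldap.get('filter', 'uid=%s'))]
        if 'bind_dn' in ldap:
            # Authenticated bind; without it the helper binds anonymously.
            args += ['-D', quote(ldap['bind_dn']), '-w', quote(ldap['password'])]
        args += ['-h', ldap['server']]
        lines.append('auth_param basic program ' + ' '.join([LDAP_HELPER] + args))
        lines.append('acl ldap_users proxy_auth REQUIRED')
        lines.append('http_access allow ldap_users')
    elif auth.get('networks'):
        lines.append('acl allowed_src src ' + ' '.join(auth['networks']))
        lines.append('http_access allow allowed_src')
    lines.append('http_access deny all')
    # Disable caching, as described in the comment above.
    lines.append('cache deny all')
    return lines
```

For example, `render_access({'networks': ['192.0.2.0/24']})` yields an `acl ... src` line, an `http_access allow` for it, and the closing `http_access deny all` and `cache deny all` lines.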

ATS looks nice.

RADIUS, SAML, OAuth? I think the last two are probably easily possible within Apache's structure (haven't looked too much into the docs ./sleepy).

When I worked in captive portalling we were not aware of ATS's existence. At the time we custom-built a system using middleware PHP to trap users in a sandbox. Upon authentication/authorization/payment we would let them roam the internet (or some private network/asset). At that point we had used mod_proxy, squid and a derivative frankenchild of chillispot.

We were migrating towards a promising POC solution based on basic Linux constructs (this was around CentOS 7). We would iptables users into a C-based redirector and forward them to Apache, have it do the customer magic, then kick them off onto the internet and route. We would intercept and kill the MAC/IP-based session when commerce dictated. A cronjob would generate RADIUS accounting data based on iptables counters. To do content filtering we would work on dynamically updated, provisioned, pre-canned categories and dnsmasq with dynamic updates for iptables lists.

hagbard changed the task status from In progress to On hold. Nov 14 2020, 3:15 PM
c-po changed the task status from On hold to In progress. Dec 10 2020, 7:46 PM
c-po claimed this task.