Today we are going to talk about Shadowsocks: how it originated, why it becomes more popular day by day, how it differs from other technologies such as VPN, HTTPS, and SSH, and what it is good for. Will it go a long way in helping people circumvent censorship and encrypt their web traffic?
1. How Shadowsocks got noticed
The crackdown on the Shadowsocks project
The Shadowsocks project remained largely unknown until the autumn of 2015. Around the time of a national celebration in China, a developer named Clowwindy posted a message on GitHub announcing that the police had knocked on his door and ordered him to stop working on Shadowsocks, and that he had no choice but to obey. Around the same time, Apple removed dozens of VPN apps from its App Store for Greater China. News also circulated that the Chinese government was cracking down on all VPN service providers; almost overnight, many previously working VPNs stopped working. From this news people inferred two things:
One, China’s Great Firewall was not able to block Shadowsocks technically, so the authorities had to trace the developer physically and ask him to stop the work.
Two, other VPNs are easy to block, so they were simply blocked; the authorities did not bother to visit the VPN providers in person, since most of them are overseas and beyond their administrative reach anyway.
That was interesting, I thought, so I decided to take a look at the developer’s GitHub page. I’m glad I did, and I came away with some interesting takeaways.
Clowwindy posted a comment on HTTPS versus a socks5 proxy, because at that time some people used HTTPS as a proxy to wrap an inner layer of HTTP or HTTPS packets. This HTTPS-inside-HTTPS approach worked quite successfully, but Clowwindy commented that it is vulnerable to GFW’s attack. GFW can easily filter on the plain text part of the HTTPS certificate, such as your company name or your domain name, and then tear down the entire HTTPS connection. If you generate another certificate of your own, GFW can block the plain text part of that new certificate too, and the browser will show a security warning that your certificate is self-issued (not signed by a trusted authority).
China’s Great Firewall (GFW)
For those who don’t know yet, China has built a cyber wall, dubbed the Great Firewall, around its border to censor information the government doesn’t like. It relies on a few simple yet powerful measures. First, GFW runs a special version of DNS across the national ISPs, so domain names such as Facebook.com resolve to a fake, invalid address instead of the real one; when users request those domains in the browser, they are directed to a dead end. Second, GFW has a gigantic plain text filter that filters out all “bad” keywords or keyword phrases at the domain level as well as at the site content level. If people request a web page that contains the “bad” phrases, the request is denied with an error page (404, 501, etc.). Third, GFW can block IP addresses it doesn’t like.
GFW’s weakness lies in its inability to filter cipher text. Cipher text looks random enough that GFW has a hard time blocking it: if you blocked every string you see in cipher text, you would end up blocking all text strings, meaning you would kill the whole web. Having said that, GFW has evolved to recognize some plain text data patterns, such as the tunnel setup formalities of most VPN protocols. That means it would be disastrous for them to block cipher text itself (so they can’t), but they can block any specific plain text pattern that appears before the cipher channel is established, thereby killing the cipher channel, whether VPN, HTTPS, or SSH.
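To get a feel for why cipher text "looks random" to a filter, here is a minimal Python sketch that measures byte-level Shannon entropy. It is only an illustration: the function name shannon_entropy is made up for this example, os.urandom stands in for real cipher text, and nothing here claims to be how GFW actually measures traffic.

```python
import math
import os
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of `data` in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    counts = Counter(data)
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


if __name__ == "__main__":
    # A repetitive plain text request, the kind of thing a keyword filter can match.
    plain = b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n" * 100
    # Random bytes used as a stand-in for cipher text (an assumption of this sketch).
    random_like = os.urandom(len(plain))
    print(f"plain text : {shannon_entropy(plain):.2f} bits/byte")
    print(f"random data: {shannon_entropy(random_like):.2f} bits/byte")
```

Plain text scores well below 8 bits per byte because a handful of characters dominate, while random-looking data sits close to the maximum, which leaves a keyword filter nothing stable to match on.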
So, when Clowwindy claimed that HTTPS isn’t an ideal proxy solution against GFW, set out to seek a better variant based on socks5, and called it Shadowsocks, he got everybody’s attention.
2. The problem of Virtual Private Networks (VPN) under GFW
The problems of VPN are threefold, from the perspective of GFW.
First, its protocol pattern is very specific and recognizable. Although GFW can’t read the cipher text once the VPN is established, it can recognize the VPN protocol during channel setup and block it.
Second, some VPN variants are proprietary code, such as PPTP by Microsoft. That means you can’t audit whether they are safe; they may have security holes that GFW knows about but you don’t.
Third, VPN was not designed with GFW in mind, and some of its properties are vulnerable to GFW’s attack. For example, a VPN encrypts and forwards your data packets, but it doesn’t necessarily proxy your DNS requests. If you are on a VPN but your DNS lookups still go through your local ISP’s DNS, you still expose yourself and get redirected to a dead route.
VPN was the go-to solution for bypassing GFW’s blockage, but as GFW evolves, most regular VPNs don’t work any more. VPNs appear to fail in countries where GFW’s technology has been deployed, such as Iran and the UAE. In most other countries or ISPs, VPN is still a viable way to bypass internet censorship.
3. The problem of HTTPS proxies under GFW
HTTPS is the web’s default security protocol. It can’t be blocked at the protocol level, because doing so would essentially block the whole web.
Having said that, HTTPS has a few problems that make it easy for GFW to block targeted HTTPS traffic.
First, the HTTPS certificate contains some plain text strings such as your domain name (xyz.com) and your organization name (xyz company). GFW can block such plain text strings and thereby effectively block you. An HTTPS proxy can generate arbitrary domain names or organization names on the fly, making it hard to block any specific string, but the problem is that these self-generated certificates won’t get past the browser’s security warning.
Second, the latest machine learning algorithms employed by GFW make it possible to detect HTTPS-inside-HTTPS traffic by analyzing the packet sizes of the stream. See this post for more details.
4. What is Shadowsocks
Shadowsocks is specifically designed to overcome GFW’s censorship and bypass its blockage. It attempts (quite successfully) to hide any protocol pattern so that GFW can’t recognize that you are using it. The aim is to make detection an unbearable load for the GFW censoring machine.
Unlike SSH, Shadowsocks uses multiple TCP connections. This makes it much faster, especially when you’re launching 100 browser instances. It also offers a UDP mode, so it can resist Great Firewall attacks such as TCP RST packet injection.
Shadowsocks is based on the socks5 protocol and can proxy both TCP and UDP traffic. In both cases it exhibits far fewer communication patterns, as it is essentially a network proxy protocol protected by an access password. The connection sequence is as follows:
Client connects and sends a greeting, which names password authentication as the method.
Server responds with a success/failure message.
Client sends a connection request.
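The first two steps above can be sketched in Python using the generic socks5 message layout from RFC 1928. Take it as a rough illustration rather than what Shadowsocks itself puts on the wire: the helper names build_greeting and parse_method_selection are invented for this example, and a real Shadowsocks implementation may simplify or encrypt this exchange.

```python
SOCKS_VERSION = 0x05
METHOD_NO_AUTH = 0x00
METHOD_USER_PASS = 0x02
METHOD_NONE_ACCEPTABLE = 0xFF


def build_greeting(methods: list[int]) -> bytes:
    """Client greeting: VER, NMETHODS, then one byte per offered method."""
    if not 1 <= len(methods) <= 255:
        raise ValueError("a greeting must offer between 1 and 255 methods")
    return bytes([SOCKS_VERSION, len(methods), *methods])


def parse_method_selection(reply: bytes) -> int:
    """Server reply: VER, METHOD. Return the chosen method or raise."""
    if len(reply) != 2 or reply[0] != SOCKS_VERSION:
        raise ValueError("malformed method-selection reply")
    if reply[1] == METHOD_NONE_ACCEPTABLE:
        raise PermissionError("server accepted none of the offered methods")
    return reply[1]


if __name__ == "__main__":
    greeting = build_greeting([METHOD_USER_PASS])
    print(greeting.hex())  # 3 bytes: version, method count, method
    print(parse_method_selection(bytes([SOCKS_VERSION, METHOD_USER_PASS])))
```

The whole greeting is only a few bytes long, which is part of why there is so little pattern to recognize compared with a VPN tunnel setup.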
We can view Shadowsocks as a simplified, password-authenticated, multi-cipher socks5 proxy.
4.2 Password authentication
Shadowsocks uses password authentication, one of the methods available in socks5:
0x00: No authentication
0x01: GSSAPI
0x02: Username/password
0x03–0x7F: methods assigned by IANA
0x80–0xFE: methods reserved for private use
It is fast, easy, and the most popular method. If the client provides the correct password, the connection is accepted and the socks channel is established.
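The method codes are easy to turn into a small lookup. The sketch below follows the ranges defined for socks5 in RFC 1928 (including 0x01, 0x02 and 0xFF, which are part of that standard); the function name describe_method is hypothetical and exists only for this example.

```python
def describe_method(code: int) -> str:
    """Map a socks5 authentication method byte to a human-readable label."""
    if not 0x00 <= code <= 0xFF:
        raise ValueError("method code must fit in one byte")
    if code == 0x00:
        return "no authentication"
    if code == 0x01:
        return "GSSAPI"
    if code == 0x02:
        return "username/password"
    if code <= 0x7F:
        return "assigned by IANA"
    if code <= 0xFE:
        return "reserved for private use"
    return "no acceptable methods"


if __name__ == "__main__":
    for code in (0x00, 0x02, 0x10, 0x80, 0xFF):
        print(f"0x{code:02X}: {describe_method(code)}")
```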
4.3 Multiple ciphers
A wonderful feature of Shadowsocks is that it implements many ciphers for users to choose from. They not only encrypt users’ data streams but also provide pattern diversity. Because the ciphers differ and each is used by a different group of users, the encrypted data shows no single specific pattern, which makes it hard for GFW to recognize and block.
At the time of writing (2018-10-02), Shadowsocks supports 16 ciphers.
Picking any of them will be fine. Some people say rc4-md5 is weak, which is true, but even rc4-md5 will give GFW a hard time decrypting.
4.4 Remote DNS
Shadowsocks’s underlying protocol, socks5, supports UDP, and UDP is what DNS lookups use. You can set Shadowsocks to use a remote DNS server, such as the DNS on the remote Shadowsocks server. This way you bypass the local DNS, which gives you poisoned lookup results.
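One way a socks5 client can leave name resolution to the remote side is to put the host name itself, rather than a locally resolved IP, into the connection request. The Python sketch below builds such a request in the RFC 1928 layout. It is an assumption-laden illustration: build_connect_request is a made-up helper, and whether a given Shadowsocks client resolves names this way depends on its configuration.

```python
import ipaddress
import struct


def build_connect_request(host: str, port: int) -> bytes:
    """Build VER, CMD=CONNECT, RSV, ATYP, DST.ADDR, DST.PORT for socks5."""
    if not 0 < port < 65536:
        raise ValueError("port out of range")
    header = b"\x05\x01\x00"  # version 5, CONNECT, reserved byte
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal: send the name so the remote side resolves it.
        name = host.encode("idna")
        if not 1 <= len(name) <= 255:
            raise ValueError("host name must be 1 to 255 bytes")
        addr = b"\x03" + bytes([len(name)]) + name
    else:
        atyp = b"\x01" if ip.version == 4 else b"\x04"
        addr = atyp + ip.packed
    return header + addr + struct.pack("!H", port)


if __name__ == "__main__":
    # The local (possibly poisoned) DNS is never asked about this name.
    print(build_connect_request("example.com", 443).hex())
```

When the address type is 0x03, the local machine never issues a DNS query for the target, so a poisoned local resolver has nothing to answer.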
4.5 Selective proxy
Selective proxy is often called PAC (proxy auto-config). You can compile a PAC file that tells Shadowsocks which URLs to proxy and which not to. This means you can switch proxies intelligently according to the rules in the PAC file, achieving both speed and efficiency. If 60% of your web browsing goes to local websites and 40% to overseas websites, a selective proxy strategy will give you the most pleasing browsing experience.
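Real PAC files are written in JavaScript, but the decision they make is simple enough to sketch in Python. The example below is only a stand-in for the idea: should_proxy and the rule set are hypothetical, and the domain-suffix matching is one plausible rule style, not the format any particular Shadowsocks client uses.

```python
def should_proxy(host: str, proxied_domains: set[str]) -> bool:
    """Return True if `host` or any of its parent domains is in the rule set."""
    labels = host.lower().rstrip(".").split(".")
    # Check "www.example.org", then "example.org", then "org".
    return any(".".join(labels[i:]) in proxied_domains for i in range(len(labels)))


if __name__ == "__main__":
    rules = {"example.org", "blocked.example"}  # hypothetical overseas sites
    for host in ("www.example.org", "local.test", "notexample.org"):
        route = "proxy" if should_proxy(host, rules) else "direct"
        print(f"{host}: {route}")
```

Matching on whole domain labels rather than raw string suffixes keeps a rule for example.org from accidentally catching notexample.org.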
4.6 HTTPS obfuscation
The Shadowsocks community has developed an HTTP/HTTPS obfuscating plugin that wraps Shadowsocks packets so they look like standard HTTPS packets, making it nearly impossible for GFW to detect and block them, especially when port 443 (the standard HTTPS port) is used for proxy traffic. Such obfuscation blurs the line between standard HTTPS and socks traffic, making GFW’s detection workload unbearably high.
4.6 OS support, even on routers
Or cross-platform, as people say.
The Shadowsocks server runs mainly on Linux server variants (CentOS, Ubuntu, Gentoo, etc.), since Linux is the backbone of internet infrastructure.
Shadowsocks clients are available on all major operating systems, desktop and mobile, computer and router. See below:
Windows 7, 8, 10
Linux: Ubuntu, CentOS
Mac OS X
OpenWrt for routers
Refer to the Shadowsocks site for all client software.
4.7 High performance
According to the official website, Shadowsocks uses “bleeding edge techniques using asynchronous I/O and event-driven programming”. From a service provider’s point of view, a single Shadowsocks server can proxy for thousands of users simultaneously. It is quite amazing.
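To see why an event-driven design can juggle so many users, here is a toy Python sketch using asyncio. It is a simulation, not Shadowsocks code: asyncio.sleep stands in for waiting on a socket, and the names handle_connection, serve_many and run_demo are invented for this example.

```python
import asyncio
import time


async def handle_connection(conn_id: int, io_delay: float) -> int:
    """Pretend to serve one user; the sleep stands in for waiting on the network."""
    await asyncio.sleep(io_delay)
    return conn_id


async def serve_many(n: int, io_delay: float) -> list[int]:
    """Run n simulated connections concurrently on a single event loop."""
    return await asyncio.gather(*(handle_connection(i, io_delay) for i in range(n)))


def run_demo(n: int = 1000, io_delay: float = 0.05) -> tuple[int, float]:
    """Return how many connections finished and how long the whole batch took."""
    start = time.perf_counter()
    results = asyncio.run(serve_many(n, io_delay))
    return len(results), time.perf_counter() - start


if __name__ == "__main__":
    handled, elapsed = run_demo()
    print(f"handled {handled} simulated connections in {elapsed:.2f}s")
```

Because each task gives up the loop while it waits, the batch takes roughly as long as one wait instead of the sum of all of them, and it does so on a single thread.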
4.8 Open source, community driven
Open source is powerful: a community of developers maintains and develops the code, and everybody can audit it, make improvements and suggestions, and point out security vulnerabilities. People can develop new versions for new types of devices or operating systems. Having an active open source community means, first of all, that Shadowsocks will be long-lived, probably outliving its rival GFW; as long as people feel the pain of blockage, they have the motive to make Shadowsocks better.
Open source also makes Shadowsocks more solid and secure, as people can review the code and check the crypto implementations to make sure there are no mistakes, no leaks, etc. It is far better than what any proprietary development team can do.
5. Shadowsocks server varieties
Shadowsocks servers have two major implementations, Shadowsocks and shadowsocks-libev. We will get into the details later.
6. Shadowsocks client & configurations
Shadowsocks clients work on all the operating systems below, and we’ve provided a comprehensive configuration guide for each of them.
7. GFW’s packet detection, DPI, machine learning
We will cover it later.
8. Browser fingerprinting
We will cover it later.