Signing your own wildcard SSL/HTTPS certificates
Tue, 06 Aug 2013

There has been a lot of concern about online privacy in the past few weeks, and lots of people are looking for ways to better protect themselves on the Internet. One thing you can do is to create your own HTTPS/SSL Certificate Authority. I have a bunch of websites on RogerHub that I want to protect, but I am the only person who needs HTTPS access, since I manage all of my own websites. So, I’ve been using a self-signed wildcard certificate that’s built into my web browser to access my websites securely. You can do this too with a few simple steps:

First, you will need to generate a cryptographic private key for your certificate authority (CA):

$ openssl genrsa -out rootCA.key 2048

Certificate authorities in HTTPS have a private key and a matching public CA certificate. You should store this private key in encrypted storage, because you will need it again if you ever want to generate more certificates. Next, you will need to create a public CA certificate and provide some information about your new CA. If you are the sole user of your new CA, then this information can be set to whatever you want:

$ openssl req -x509 -new -nodes -key rootCA.key -days 9999 -out rootCA.pem

Let me explain a bit about these two commands. The openssl command line utility takes a subcommand as its first argument. The subcommands genrsa and req are used to create RSA keys and to manipulate certificate signing requests, respectively. Normally, when you go out and purchase an expensive SSL certificate, you generate a key yourself and create a Certificate Signing Request (CSR), which is given to your certificate authority to approve. The CA then sends back a public SSL certificate that is presented to website clients. Generating the key yourself means that the private key never passes through the hands of your CA, so only you have the ability to authenticate using your SSL certificate.

The x509 switch in the argument above tells openssl req to output a self-signed certificate rather than a Certificate Signing Request; that self-signed certificate is your new Certificate Authority. The nodes switch actually means “no DES”, which means that the private key will not be encrypted with a passphrase. In this case, encrypting the key is not strictly necessary if you are the only party involved.
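Two optional bits of housekeeping are worth mentioning. If you would rather keep the CA’s private key encrypted at rest, you can add a passphrase to it after the fact, and if you want command-line tools (not just your browser) to trust the new CA, you can install its certificate into the system trust store. Here is a rough sketch, assuming a Debian or Ubuntu system with the ca-certificates package (the output filenames are just examples):

$ openssl rsa -aes256 -in rootCA.key -out rootCA.secure.key
$ sudo cp rootCA.pem /usr/local/share/ca-certificates/my-rootCA.crt
$ sudo update-ca-certificates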

You have just created a new Certificate Authority key and certificate pair. At this point, you can import the rootCA.pem certificate into your web browser and instruct it to trust it for identifying websites. From then on, your browser will accept any website certificate that is signed by your new CA. Now you’re ready to make some website certificates. Create a private key and Certificate Signing Request for your new website as follows:

$ openssl genrsa -out SITE.key 2048

....

$ openssl req -new -key SITE.key -out SITE.csr

....

> Common Name ... : (put the name of your domain here)

Remember to put the domain name of your website as the common name when it asks you for it. Once you’ve completed the certificate signing request, you need to use your new Certificate Authority to issue a certificate for this new key:

$ openssl x509 -req -days 9999 -in SITE.csr -CA rootCA.pem -CAkey rootCA.key -CAcreateserial -out SITE.crt

You’ve created a functional SSL certificate and keypair for your website! You can configure your web server to use your new site key and certificate, and it will begin serving resources over HTTPS. However, this certificate will only work for the domain that you specified before in your CSR. If you want to create a single certificate that will work for multiple domains (called a wildcard certificate or a multi-domain certificate), you will need some more steps. Create a file named SITE.cnf, and put the following inside:

[req_distinguished_name]
countryName = Country Name (2 letter code)
stateOrProvinceName = State or Province Name (full name)
localityName = Locality Name (eg, city)
organizationalUnitName	= Organizational Unit Name (eg, section)
commonName = Common Name (eg, YOUR name)
commonName_max	= 64
emailAddress = Email Address
emailAddress_max = 40

[req]
distinguished_name = req_distinguished_name
req_extensions = v3_req

[v3_req] 
keyUsage = keyEncipherment, dataEncipherment
extendedKeyUsage = serverAuth
subjectAltName = @alt_names

[alt_names]
DNS.1 = rogerhub.com
DNS.2 = *.rogerhub.com

Under the last block, you can list as many domains and wildcards as you want. With this configuration file in place, use the following command to generate your CSR:

$ openssl req -new -key SITE.key -out SITE.csr -config SITE.cnf

Then, instead of the signing command shown earlier, run the following to generate your wildcard HTTPS certificate:

$ openssl x509 -req -days 9999 -in SITE.csr -CA rootCA.pem -CAkey rootCA.key -CAcreateserial -extensions v3_req -out SITE.crt -extfile SITE.cnf
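If you want to double-check the result, you can confirm that the alternative names made it into the certificate and that it validates against your CA. These openssl invocations are just a quick sanity check, not part of the signing procedure:

$ openssl x509 -in SITE.crt -noout -text | grep -A1 "Subject Alternative Name"
$ openssl verify -CAfile rootCA.pem SITE.crt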

The v3_req block above defines X.509 v3 extensions (in particular, subjectAltName), which is what allows a single certificate to cover more than one hostname. One of the flags in the certificate creation command is CAcreateserial. It will create a new file named rootCA.srl whose contents are updated every time you sign a certificate. You can use CAcreateserial the first time you sign a website certificate, but thereafter, you will need to provide that serial file when you sign more certificates. Do this by replacing -CAcreateserial with -CAserial rootCA.srl in the final command. A lot of the concepts here only hint at the greater complexity of HTTPS, openssl, and cryptography in general. You can learn more by reading the relevant RFCs and the openssl man pages.

Backing up my data as a linux user
Tue, 16 Jul 2013

It’s a good habit to routinely back up your important data, and over the past few years, dozens of cloud storage/backup solutions have sprung up, many of which offer a good deal of free space. Before you even start looking for a backup solution, you need to sit down and think about what kind of data you’re looking to back up. Specifically: how much data do you have, how often does it change, and how desperately do you need to keep it safe?

I have a few kinds of data that I actively back up and check for integrity. (Don’t forget to verify your backups, or there isn’t any point in backing them up at all.) Here are all the kinds of data that might be on your computer:

  • Program code – irreplaceable, small size (~100MB), frequently updated
  • Documents, and personal configuration files – irreplaceable, small size (~100MB), regularly updated
  • Personal photos – mostly irreplaceable, large size (more than 10GB), append only
  • Server configuration and data – mostly irreplaceable, medium size (~1GB), regularly updated
  • Collected Media – replaceable if needed, medium size (~1GB), append only
  • System files – easily replaceable, medium size (~1GB), sometimes updated

Several backup solutions try to back up everything. This is not a good idea. First, there are a lot of files on your computer that are easily replaceable (system files) and others that you’d rather not keep in your backup archives (program files). Second, those solutions have no way of giving extra redundancy to the things that matter most, and less redundancy to the things that matter less.

In addition to these files, here are some types of data that you might not usually think about backing up:

  • Email
  • RSS and Calendar data
  • Blog content
  • Social networking content

My backup solution is a mix consisting of free online version control sites, Google, Dropbox, and a personal file server. My code, documents (essays, forms, receipts), and configuration (bash, vim, keys, personal CA, wiki, profile pictures, etc.) are the most important part of my backup. I sync these with Insync to my Google Drive, where I’ve rented 100GB of cloud storage. My Google Drive is regularly backed up to my personal file server, with about 2 weeks of retention.
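A dated-snapshot scheme built on rsync’s --link-dest option is one simple way to get that kind of retention on a file server. Here is a minimal sketch; the paths and the 14-snapshot cutoff are placeholders, not necessarily what my file server runs:

#!/bin/sh
# Snapshot the local Google Drive mirror into a dated directory, hard-linking
# unchanged files against the previous snapshot, then prune old snapshots.
SRC="$HOME/GoogleDrive/"
DEST="/srv/backups/google-drive"
TODAY="$(date +%F)"

rsync -a --delete --link-dest="$DEST/latest" "$SRC" "$DEST/$TODAY"
ln -sfn "$DEST/$TODAY" "$DEST/latest"

# Keep only the newest 14 dated snapshots (roughly two weeks of daily runs).
ls -1d "$DEST"/20??-??-?? | head -n -14 | xargs -r rm -rf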

Disks and old computers are cheap. Get a high-availability file server set up in your home, and you can happily offload intensive tasks to it like virtual machines, backup services, and archival storage. Mine is configured with:

  • Two 1TB hard drives in a RAID-1 mirror
  • Excessive amounts of RAM and processing power, for a headless server
  • Ubuntu Server with SMART monitoring, nagios, nginx (for some web services), and a torrent client
  • Wake-on-LAN capability (see the example below)
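Waking the server up remotely just takes its NIC’s MAC address; with the wakeonlan package installed on another machine on the LAN, it’s a one-liner (the MAC address below is a placeholder):

$ wakeonlan 00:11:22:33:44:55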

Backing up your Google Drive might sound funny to you, but it is a good precaution in case anything ever happens to your Google Account. Additionally, most of my program code is in either a public GitHub repository or a private BitBucket repository. Version control and social coding features like issues and pull requests give you additional benefits beyond simply backing up your code, and you should definitely be using some kind of VCS for any code you write.

For many of the projects that I am actively developing, I only use VCS and my file server. Git object data should not be backed up to cloud storage services like Google Drive because it changes too often. My vim configuration is also stored on GitHub, to take advantage of git submodules for my vim plugins.
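Adding a plugin as a submodule looks roughly like this (the plugin and the bundle path are only examples, not necessarily what my configuration uses):

$ cd ~/.vim
$ git submodule add https://github.com/tpope/vim-surround.git bundle/vim-surround
$ git commit -m "Add vim-surround as a submodule"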

My personal photos are stored in Google+ Photos, formerly known as Picasa. They give you 15GB of shared storage for free, and if that’s not enough, additional space is cheap as dirt. My photos don’t have another level of redundancy like my code and configuration files do. They are less important to me, and Google can be trusted to sustain itself longer than any backup solution you create yourself.

I host a single VPS with Linode (that’s an affiliate link) that contains a good amount of irreplaceable data from my blogs and other services I host on it. Linode itself offers cheap and easy full-disk backups ($5/mo.) that I signed up for. Those backups aren’t intended for hardware failures so much as human error, because Linode already maintains high-availability redundant disk storage for all of its VPS nodes. Additionally, I back up the important parts of the server (/etc, /home, /srv, /var/log) to my personal file server, for an extra level of redundancy.
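Pulling those directories over SSH with rsync is enough for this; here is a sketch, assuming key-based SSH access to the VPS (the hostname and destination path are placeholders):

$ rsync -az --delete root@linode:/etc root@linode:/home root@linode:/srv root@linode:/var/log /srv/backups/linode/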

Any pictures I collect from online news aggregators are dumped in my Google Drive and share the same extra redundancy as my documents and personal configuration files. Larger media like videos are stored on one of my USB 3.0 flash drives, since they are regularly created and deleted.

I don’t back up system files, since Xubuntu is free and programs are only one package-manager command away. I don’t maintain extra redundancy for email, for the same reason I don’t for photos.

A final thing to consider is the confidentiality of your backups. Whenever you upload data to a free public cloud storage service, you should treat the data as if it were being anonymously released to the public. In other words, personal data, cryptographic keys, and passwords should never be uploaded unencrypted to a public backup service. Things like PGP can help in this regard.
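For example, you can encrypt an archive symmetrically with GnuPG before it ever leaves your machine, and decrypt it again when you need it (the filenames here are arbitrary):

$ tar czf - ~/Documents/keys | gpg --symmetric --cipher-algo AES256 -o keys.tar.gz.gpg
$ gpg -d keys.tar.gz.gpg | tar xzf -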

Tumblr’s Phishing Protection Code
Sat, 20 Apr 2013

At the top of every Tumblr user’s blog, there’s a piece of JavaScript inserted by Tumblr itself. In general, Tumblr is very generous about the control they give you over your blog’s appearance. They don’t insert any advertisements or enforce any global content other than a Quantcast visitor analytics script, the follow/dashboard controls in the upper right, and this script. I’ve posted the first few lines here:

(function(){var a=translated_warning_string;var b=function(d){d=d||window.event;var c=d.target||d.srcElement;if(c.type=="password"){if(confirm(a)){b=function(){}}else{c.value="";return false}}};setInterval(function(){if(typeof document.addEventListener!="undefined"){document.addEventListener("keypress",b,false)}},0)})();

This isn’t particularly clever or difficult to understand, but it does utilize several good ideas in JavaScript programming, so I’d like to go over it. First of all, you’ll notice that this extract is a single line of code that’s surrounded by (function() { ... })();. Javascript treats functions as first-class citizens. They can be declared, reassigned, and called inline. There are several advantages of first-class functions in any dynamically-typed language:

  • The ability to create anonymous functions is particularly helpful when you don’t want to clog up your global namespace with function names that are only used in one part of your code.
  • They can be created dynamically using what’s known in many languages as a closure. Closures take their variables from the environment frame in which they were created, so you can generate them on-the-fly in a loop, in an event handler, or as part of a callback.
  • Even if you don’t need or want any of the fancy benefits listed so far, it’s useful that local variables declared in first-class functions are destroyed when the function quits. That way, you can use names like a or _ without polluting your global namespace.

I’ve reformatted the statements inside the function here:

var a = translated_warning_string;
var b = function(d){
  d = d||window.event;
  var c = d.target||d.srcElement;
  if (c.type == "password"){
    if(confirm(a)){
      b = function(){}
    } else {
      c.value = "";
      return false
    }
  }
};
setInterval(function(){
  if (typeof document.addEventListener != "undefined"){
    document.addEventListener("keypress",b,false)
  }
}, 0)

The first line references the global translated_warning_string variable that is declared in the blog’s HTML itself. Its contents should hint at the goal of this code. (Although it looks like not all languages are supported yet.) Assigning it to the shorter name a means that it can be reused without making the code too long, but it seems that the variable is only actually referenced once here.

Variable b gets assigned to a function in the same way we manipulated functions earlier. The d = d||window.event; line is shorthand for “if d is falsy, use window.event instead; otherwise, leave it alone.” It is a handy way to specify a default value in a variable assignment when you’re uncertain whether the preferred value will work or not. In this case, d defaults to window.event, which is a workaround for older versions of Internet Explorer. (more on this later)

I was kidding when I said later. Here’s more on it now. Skip ahead a few lines and peek over at this part of the code:

if (typeof document.addEventListener != "undefined"){
  document.addEventListener("keypress",b,false)
}

The function we declared earlier, b, is used as the event listener for the document’s keypress event. It is triggered whenever someone presses a key while on the webpage (actually, it’s a little more complicated than this). Event listeners are functions that are called with a single parameter, an event object, which contains information about the event that triggered the call. This is not the case in older versions of Internet Explorer, hence the d = d||window.event; statement earlier.

When a keypress happens, the event actually travels down the tree to the target element first, from the root to the leaf (document → html → body → …). It gives each of these elements a chance to “capture” the event and handle it before it reaches its destination. This is known as event capturing, and is specified by the third argument to EventTarget.addEventListener. The other, more commonly used and default behavior is to let the event bubble up from the destination through its ancestors, each of which gets a chance to handle it, until the propagation is stopped or it reaches the top of the tree.

The code examined here chooses not to capture the event on the way down. (If it did, then all events would have to go through this handler.) The trade-off to not interfering with events that already have keypress handlers is that the behavior can be easily overridden by rogue sites (although this is easily detectable).

Back to where we were: we now see that variable d is an event, so c is its EventTarget, which is also a node in the DOM tree. The variable-assignment-with-default trick is used again here.

if (c.type == "password"){
  if(confirm(a)){
    b = function(){}
  } else {
    c.value = "";
    return false
  }
}

If the event’s source is a password input, raise a confirmation dialog with the text stored in a, the translated confirmation string. If the confirmation passes, then b is set to an empty anonymous function, function(){}. Recall that b was previously defined to be another anonymous function and was also used in our event listener. Clearing this variable after the first successful confirmation prevents the script from prompting the user more than once, which is a good idea.

If the user rejects the confirmation, then the password input is cleared and the key event is halted with return false. Note that JavaScript dictates that the last semicolon in a block is optional, although it is usually good style to include it anyway.

Finally, note that the event listener is wrapped inside a setInterval(function() { ... }, 0). setInterval is a built-in Javascript function that runs a piece of code regularly. The second parameter is the delay in between calls, specified in milliseconds. In this case, 0 milliseconds is used, but the web browser imposes a lower limit on this delay.

The function’s body checks the type of document.addEventListener. This function is part of the DOM that is set up near the beginning of every page load. Once the event infrastructure is available, the listener is attached. A more common way to achieve the same effect is to attach an event listener to the window object’s onload event, which is usually done through one of the many JavaScript libraries, although this solution is appropriate in the context of this script.

Protecting yourself on open wifi with Firefox
Mon, 25 Mar 2013

So, I’m sitting in the back of Brewed Awakening right now in the midst of café-goers, some of whom I know must be sniffing packets from the several overlapping open wifi networks around this dense part of campus. The spread of free wifi access points is an excellent direction for humanity, but it comes with its risks. Unless you’re browsing over HTTPS, anybody with a capable wifi network adapter can sit innocuously across the café and record everything you’re transmitting and receiving on your laptop, tablet, or smartphone. It may not seem immediately concerning that strangers know what kind of sick forums you frequent, but it becomes a security issue when you start transmitting your passwords in the clear (you know, the same ones you use for banking and email).

It just so happens that Linode, the hosting company that hosts the RogerHub network, recently announced a 10x increase in bandwidth caps for all their clients, and Opera announced their decision to move their browser line over to WebKit, sparking conversation about our complacency with Safari/Chrome. These two together motivated me to return to Firefox and its vast add-on ecosystem to try to address this issue.

ssh -D 1025 rogerhub

You’re probably familiar with ssh’s ability to forward TCP connections and erect ad-hoc SOCKS proxies. If not, you should definitely check out man ssh and read through the -L and -D flags. This command sets up a SOCKS proxy on localhost:1025 (ports below 1024 are privileged and can only be bound by root) through which you can forward web traffic and stuff.
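For comparison, -L forwards a single local port to a fixed host and port on the far side, while -D gives you a general-purpose SOCKS proxy (the port numbers here are arbitrary):

$ ssh -L 8080:localhost:80 rogerhub    # local port 8080 → port 80 on the remote end
$ ssh -D 1025 rogerhub                 # dynamic SOCKS proxy on localhost:1025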

Now, Google Chrome’s proxy support isn’t great for two reasons. First, they’ve had corporate adoption in mind since the beginning, so Chrome has historically read proxy settings from the environment variables, group policy, or system configuration, whatever it is on your system. In their Windows version, there’s also a UI to set proxy settings, but it isn’t their main focus. Second, their extension API doesn’t allow for the same kind of deep integration with the UI and with the program internals as Firefox’s add-on environment allows.
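To be fair, Chrome can at least be launched against a SOCKS proxy with a command-line flag, though that covers the whole session rather than giving you the per-site switching that a Firefox add-on can (the flag is shown here for the Linux google-chrome binary):

$ google-chrome --proxy-server="socks5://localhost:1025"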

So, I opened up Firefox and installed FoxyProxy, the popular proxy-switching add-on, and configured it with the SSH proxy. I also pulled in NoScript for the sake of locking down the browser itself.

NoScript comes with a bunch of draconian defaults, and isn’t very useful without a bit of configuration (you could, for example, just turn off scripts in Firefox instead of keeping its defaults). Enabling same-origin scripts (“Base 2nd level domains” works well for me) will let most sites and their CDN subdomains function the way they were meant to, while ignoring the GA, Facebook, and ad network trackers. Of course, this isn’t very good practice for casual browsing, but open wifi is a battlefield.

Altogether, it makes for enough security to give you peace of mind while browsing on public wifi. Firefox, as a browser, has really improved over the years as well, especially its web developer tools which (back in the day) once consisted of an Error Console and the imperative to install/learn Firebug. Also, splitting search and location just makes sense.

Same origin policy and a buggy WordPress plugin
Wed, 06 Mar 2013

Update

I don’t use the plugin mentioned in this post anymore.

On this blog, I use the Crayon syntax highlighter for WordPress to render all the code snippets, since this is a programming blog after all. Crayon is one of the more popular highlighting plugins, as clearly demonstrated by the sad condition of its support forum. It comes with a bunch of color schemes including Ethan Schoonover’s extremely popular “solarized” color scheme and a replica of what Github uses (although the plugin’s version contains a tad bit more purple). A light pastel blue would have fit with this blog’s color scheme quite nicely, but definitely not purple. So, I dove deeper.

Crayon stores its highlighting themes as plain CSS in WordPress’s upload directory. That’s the same place that photos and other media go for your posts. I can think of a couple good reasons why this decision makes more sense than loading the CSS directly from the plugin’s folder:

  • WordPress has (by necessity) file permissions to write to the uploads directory, which means that if the plugin ever needs to customize themes, it can do that internally.
  • There are hooks for CDNs and such that mirror WordPress’s upload directory, but may not do the same for plugins and other things.
  • On network sites, customized color schemes can be site-specific although the plugin is installed and activated network-wide.

However, there’s also one really bad side effect of this (which likely applies only to me), and I spent a good deal of the afternoon debugging it. I run nginx on the backend, and its configuration divides WordPress into two regions: one contains wp-login.php and wp-admin, and the other, everything else. This way, I can restrict WordPress cookies to only the former, which is served over SSL using a self-signed certificate, and avoid having the issue where Google indexes everything twice. (Visitors shouldn’t be reading blogs over https anyway.)

It would be nice if everything could be divided so cleanly, but wp-admin also uses file uploads like image thumbnails, which is why you see all the mixed-content warnings if you’ve ever administered anything through a web panel over https. Browsers complain when you try to load pages with mixed content, but they will let you do so anyway. Things are different, however, when the request is made after the page has loaded, through AJAX.

There’s an inherent security problem with letting a page script load whatever resources it wants to from anywhere on the Internet. This issue is addressed by the same-origin policy and cross-origin resource sharing, wherein the server hosting the resource (as opposed to the one serving the web page) gets to decide whether access is allowed.

HTTP/1.1 200 OK
Access-Control-Allow-Origin: http://example.com

The header takes a single origin, null, or the wildcard *. It’s ultimately up to the browser to enforce this security header, but most modern ones do.
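You can check whether a server sends this header for a particular resource with curl; something along these lines works (the URL is a placeholder):

$ curl -sI -H "Origin: https://example.com" http://example.com/wp-content/uploads/themes/my-theme.css | grep -i access-control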

Now, here’s the twist: Crayon uses the built-in WordPress function wp_upload_dir() to find and load custom themes. If your server is set up like mine, this will return an http-scheme address, whereas an https address is required. Without a check to see if the page is loaded through SSL, the plain-text AJAX call will fail, because browsers treat the SSL and the plain-text versions of a site as different origins.

There are a couple ways to go about this. You can:

  • Disable SSL temporarily while you’re editing the custom color themes (probably the easiest).
  • Instruct the browser to ignore same-origin policy.
  • Patch the plugin code to use https where appropriate.

I ended up going with the third, after unsuccessfully attempting to send an Access-Control-Allow-Origin header with every response on the server side. This was a rather frustrating issue to resolve, but it was interesting to see the kinds of problems that arise in exchange for progress in security.
