TCP PAWS extension breaks RIPE WHOIS lookups when behind NAT

Posted 12/03/2012 22:51

Background Info

For the last few weeks I have been encountering a strange problem with making IP WHOIS queries against the RIPE database, which covers all European IPs.

I first encountered the problem during a routine server upgrade and reboot. Suddenly some of our software that we run on these servers started producing errors saying that WHOIS lookups could not be performed.

After some investigation it transpired that only IP WHOIS lookups against the RIPE database were failing. What is more, it was only happening on a couple of our servers, even though they all sit behind the same shared firewall and are source-NATed to the same public IP.

As time went by I upgraded more servers, and each time the newly upgraded server also started exhibiting the behaviour. Naturally my first thought was that something in the kernel upgrades that had warranted the reboot was to blame.

I began to downgrade some of the servers to their previous kernel versions, but this did not fix the issue. Stranger still, some of the servers running the new kernels started working again, but intermittently!

Break out TCPDUMP

To understand what was going on, I ran tcpdump on the firewall server to compare the traffic from a working server with that from a non-working one.

The results of a working server looked like this:

13:14:24.291132 IP x.x.x.x.40474 > 193.0.6.135.43: S 1723346221:1723346221(0) win 5840 mss 1460,sackOK,timestamp 3097608306 0,nop,wscale 4

The results of a non-working server looked like this:

12:58:26.886531 IP x.x.x.x.47159 > 193.0.6.135.43: S 9443771:9443771(0) win 5840 mss 1460,sackOK,timestamp 2068177 0,nop,wscale 4

Initially, the packets looked the same, with nothing obviously wrong.

The only difference was that the TCP timestamp in the packet from the newly rebooted, non-working server was much lower than the one from the server that had been running for months and could perform WHOIS lookups fine.

Surely this is perfectly acceptable, even behind NAT, because TCP connections use packet sequence numbers, not timestamps, to order packets? If this wasn't the case, surely NAT would break things all the time?

As it turns out (after much searching) there is an extension to TCP called PAWS (Protection Against Wrapped Sequences). It uses the TCP timestamp option to prevent older packets from the same connection interfering with current TCP communication on high bandwidth, high latency links.

Unfortunately it seems that the RIPE network has PAWS enabled, and that WHOIS requests made from multiple servers behind the same public IP cause packets to be dropped because they carry conflicting combinations of timestamp and sequence number.
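
As a rough illustration of why this can go wrong behind NAT, here is a minimal Python sketch of a receiver that remembers the newest timestamp seen per source IP and drops anything that looks older. This is an assumption about how the remote end might behave, not RIPE's actual configuration; the class name, the documentation IP address and the timestamp values are all hypothetical and chosen for illustration.

```python
# Simplified, hypothetical model of a PAWS-style check keyed on source IP.
# Real TCP stacks are considerably more involved than this.
TS_MODULUS = 2 ** 32


def ts_is_older(ts_val, ts_recent):
    """True if ts_val is older than ts_recent in 32-bit wrap-around arithmetic."""
    return ((ts_val - ts_recent) % TS_MODULUS) >= TS_MODULUS // 2


class PerHostTimestampCache:
    """Remembers the newest timestamp seen from each source IP."""

    def __init__(self):
        self.ts_recent = {}

    def accept_syn(self, src_ip, ts_val):
        last = self.ts_recent.get(src_ip)
        if last is not None and ts_is_older(ts_val, last):
            return False  # treated as a stale duplicate and dropped
        self.ts_recent[src_ip] = ts_val
        return True


cache = PerHostTimestampCache()
nat_ip = "203.0.113.10"  # stand-in for the shared public IP

# Two different servers behind the same NAT, each with its own timestamp clock.
long_running = cache.accept_syn(nat_ip, 1500000000)  # up for months
just_rebooted = cache.accept_syn(nat_ip, 2000000)    # clock restarted at boot
print(long_running, just_rebooted)
```

The receiver cannot tell that the two SYNs came from different machines, so the freshly rebooted server's low timestamp looks like an old packet from the same host.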

The Resolution

The resolution to this problem turned out to be very simple: disable TCP timestamps in our outgoing packets.

sysctl net.ipv4.tcp_timestamps=0

With timestamps disabled PAWS cannot operate, and all the servers were immediately able to perform WHOIS lookups with no problems.

Interestingly, enabling PAWS on your network can potentially introduce a DoS attack vector: an attacker can forge a packet that sets a host's timestamp artificially high, preventing future genuine communication.

Freeswitch Text-To-Speech Caching with Cepstral and LUA

Posted 13/11/2011 15:05

Recently I have been working on a project using software called Freeswitch, which is an excellent open source SIP server.

The project required the use of a text-to-speech (TTS) engine called Cepstral.

However, Cepstral's product suffers from concurrency problems when handling many simultaneous phone calls. Additionally there is about a 1 second delay before TTS audio actually starts to play, which can be off-putting for the callers.

To overcome these issues I have implemented a caching mechanism using Freeswitch's built-in integration with the Lua scripting language.

Our system tends to 'say' the same things over and over again, so caching the TTS output to a wav file allows Freeswitch to just play back the sound file, rather than generate the same audio every time.

The TTS Cache Script

The script below should be installed into the scripts directory in Freeswitch, commonly /opt/freeswitch/scripts/tts_cache.lua.

-- This script generates a wav file of the sentence passed in to it.
-- It uses the Cepstral swift command to perform text-to-speech conversion.
-- If the wav file already exists for this sentence, then it is not
-- generated.
api         = freeswitch.API();
msg         = argv[1];
msgMd5      = api:execute( "md5", msg );
filename    = '/var/lib/tts_cache/' .. msgMd5 .. '.wav';
cmd         = '/opt/swift/bin/swift';

-- Set a channel variable so that we know which file to play back.
session:setVariable( 'tts_file', filename );

-- Check whether the file already exists.
file, errMsg = io.open( filename, "r" )
if file then
    -- Cache hit: close the handle we only opened to test for existence.
    file:close();
else
    -- Cache miss: escape characters that are special inside a double-quoted
    -- shell string, then generate the wav file.
    safeMsg = msg:gsub( '(["$`\\])', '\\%1' );
    api:execute( 'system', cmd .. ' -o "' .. filename .. '" "' .. safeMsg .. '"' );
end

Using The TTS Cache Script

To use the script, first create a directory to save the cache files in and ensure Freeswitch can write to it:

mkdir /var/lib/tts_cache
chown freeswitch /var/lib/tts_cache

Next, create a phrase macro so that you can use it within a dial plan or IVR setting. This commonly goes in /opt/freeswitch/conf/lang/en/ivr/tts_cache.xml:

<include>
  <!--Provides a phrase to speak custom text-->
  <macro name="tts_cache">
    <input pattern="(.*)">
      <match>
        <action function="execute" data="lua(tts_cache.lua '$1')"/>
        <action function="play-file" data="${tts_file}" />
      </match>
    </input>
  </macro>
</include>

Finally, in your dial plan you can use this script like so:

<action name="phrase" data="tts_cache,Hello World" />

Now the first time the phrase "Hello World" is requested, it is passed into the Cepstral swift command, which generates a wav file. Whenever the same phrase is requested in the future, Freeswitch will just play back the wav file, which is much quicker.
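
The same cache-by-hash idea can be sketched outside Freeswitch. The Python below is only an illustration: cached_tts_path and fake_synthesize are hypothetical names, and the stand-in synthesizer writes a placeholder file instead of calling a real TTS engine.

```python
import hashlib
import os
import tempfile


def cached_tts_path(text, cache_dir, synthesize):
    """Return the cached wav path for text, calling synthesize only on a miss."""
    digest = hashlib.md5(text.encode("utf-8")).hexdigest()
    path = os.path.join(cache_dir, digest + ".wav")
    if not os.path.exists(path):
        synthesize(text, path)
    return path


calls = []


def fake_synthesize(text, path):
    # Stand-in for the real TTS engine: record the call, write a placeholder.
    calls.append(text)
    with open(path, "wb") as handle:
        handle.write(b"placeholder audio")


cache_dir = tempfile.mkdtemp()
first = cached_tts_path("Hello World", cache_dir, fake_synthesize)
second = cached_tts_path("Hello World", cache_dir, fake_synthesize)
print(first == second, len(calls))
```

Keying the file name on a hash of the text means any phrase, however long or oddly punctuated, maps to a safe and predictable file name.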

Rdiff Backup

Posted 01/05/2011 11:40

I have recently been investigating backup solutions at work. I have been trying out rdiff-backup, which is a tool that allows you to remotely synchronise files and maintain snapshots of previous versions.

I have written up my notes in the Wiki.

Prevent Dog Pile In NginX

Posted 17/10/2010 11:48

According to Urban Dictionary, a Dog Pile is:

A group of people jumping on one person and creating a tower of people while crushing the people on bottom.

This term is used in the area of cache invalidation too. It means that when a cached object expires, it causes a rush to refresh the object from source, which can sometimes lead to the source being overwhelmed with requests.

NginX has protection against this when using its reverse proxy cache feature. More in the wiki.
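
The general idea behind dog pile protection can be sketched in a few lines of Python. This is not how NginX implements it, just a minimal illustration of the principle under the assumption of a single cached object: when the object has expired, only one caller refreshes it while the others wait for the result. SingleRefreshCache and slow_fetch are hypothetical names.

```python
import threading
import time


class SingleRefreshCache:
    """Caches one value and lets only one thread refresh it at a time."""

    def __init__(self, fetch, ttl_seconds):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._lock = threading.Lock()
        self._value = None
        self._expires_at = float("-inf")  # start out expired

    def get(self):
        if time.monotonic() < self._expires_at:
            return self._value  # still fresh, no locking needed
        with self._lock:
            # Re-check once we hold the lock: another thread may have
            # refreshed the value while we were waiting.
            if time.monotonic() >= self._expires_at:
                self._value = self._fetch()
                self._expires_at = time.monotonic() + self._ttl
            return self._value


fetch_calls = []


def slow_fetch():
    # Stand-in for an expensive request to the origin server.
    fetch_calls.append(time.monotonic())
    time.sleep(0.05)
    return "fresh content"


cache = SingleRefreshCache(slow_fetch, ttl_seconds=60)
threads = [threading.Thread(target=cache.get) for _ in range(10)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
print(len(fetch_calls))
```

Ten simultaneous requests arrive at an empty cache, yet the origin is asked only once.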

Convert M4A files to MP3

Posted 13/10/2010 21:49

Apologies for the lack of updates recently. I have been really busy with a new product at work.

Anyway, today's post is a quick article on converting M4A files to MP3 files for usage on ageing portable players (like mine).

More in the wiki.

Apache logging to central syslog server

Posted 03/07/2010 09:58

Apache web server traditionally writes to local log files in /var/log/httpd.

At work we have been looking into PCI compliance, and it requires that log files are stored centrally so that if a server gets compromised and the local log files are modified, there is still an authoritative copy on the central log server.

Syslog is the standard Linux and UNIX way of transmitting log entries to a central server.

The problem is that Apache only supports logging to syslog for its error log, and not its access log.

Thankfully the problem is relatively easy to fix with a short Perl script.

Apache Log Config

I am doing this on a RedHat server, so the config file locations will be specific to that.

Create a new file called /etc/httpd/conf.d/syslog.conf:

LogLevel warn
LogFormat "%v %V %h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" vcombined
CustomLog "|/usr/bin/httpd_syslog" vcombined
ErrorLog syslog:local2

Next create a Perl script /usr/bin/httpd_syslog (and make it executable) to accept piped logs from Apache and send them to the local syslog service:

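#!/usr/bin/perl
use strict;
use warnings;
use Sys::Syslog qw( :DEFAULT setlogsock );

# Log via the local UNIX socket, tagging entries 'httpd' on facility local1.
setlogsock('unix');
openlog('httpd', 'cons,pid', 'local1');

# Read from STDIN and log to syslog
while (my $log = <STDIN>) {
        chomp $log;
        # Pass the line as an argument to '%s' so that any '%' characters in it
        # (e.g. URL-encoded requests) are not treated as format specifiers.
        syslog('notice', '%s', $log);
}
closelog();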

You can now reload Apache and both error logs and access logs should start flowing into /var/log/messages.

Syslog Config

Next up, we need to split out the error logs and access logs into their own files again, rather than have them appearing in the main /var/log/messages file.

To do this we need to modify /etc/syslog.conf:

Comment out the line:

*.info;mail.none;authpriv.none;cron.none  -/var/log/messages

And replace it with:

*.info;mail.none;authpriv.none;cron.none;local1.none;local2.none  -/var/log/messages
local1.*  -/var/log/httpd_access_log
local2.*  -/var/log/httpd_error_log

This tells the local syslog service not to log entries from local1 and local2 facilities (which we are using for Apache) into /var/log/messages, and instead log them into separate log files.

Central Syslog Server

To instruct the local syslog server to send all entries to a central server, add the following line to /etc/syslog.conf:

*.* @IP_ADDRESS_OF_LOG_SERVER

Now restart syslog:

service syslog restart

Log Rotation

Finally, ensure that the new log files get rotated along with the other syslog files. Modify the top line of /etc/logrotate.d/syslog so it looks like this:

/var/log/messages /var/log/secure /var/log/maillog 
/var/log/spooler /var/log/boot.log /var/log/cron 
/var/log/httpd_access_log /var/log/httpd_error_log {

Git and Github in the workplace

Posted 18/05/2010 20:05

Today we started using Git. We are using Github to host our repositories for a private project. Previously we have been an SVN shop and have built various deployment systems around it.

I have encountered the following issues, and will separate them into two sections: those relating to Git as a tool, and those relating to Github as a service.

Github Issues

User management and access control

As a Systems Administrator and Developer I spend my time both writing applications and maintaining services for other developers. One of these services is version control. In the past we used SVN and set up a master password list containing users for each developer. It wasn't perfect, but because it was hosted internally, the only way to access it from outside was using a VPN that authenticated against the master Active Directory server.

Git as a tool seems to provide good user management and access control by utilising SSH, where each user can have either a separate system account, or log in to a shared account using an SSH key.

In the event that an employee left and needed their account revoked, simply removing that user or their key would suffice. In addition, because it would be hosted internally, any access would be done through the VPN, and presumably their AD account would also be shut down.

However, with Github, which encourages 'social coding', access control and user management are severely limited. Firstly, each user has their own Github account that is in no way related to the company's project account. They are then invited to join a company repository as a 'collaborator'. Unfortunately this is done on a per-repository basis, meaning that removal of a user must be done manually for each.

No prizes for guessing who is going to get lumbered with that task!

Archive Support

Github have disabled archive support in Git, meaning that you cannot export a tagged release for packaging. Instead you have to clone or pull a repo, and then package it. This is inefficient as it also pulls down all commit changes, which are not necessary when packaging software.

Git Issues

The first issue relates to the previous one: extracting code from a repo for packaging without also downloading all the commit changes.

There is no equivalent of svn export, and whilst archive appears to be what I need, Github does not support this. I don't know if this is a problem with Git or Github, but either way it is annoying.

The other issue is complexity. Granted I am new to all this distributed version control stuff, so will stick with it. But when I started using SVN it just clicked and immediately made my life easier. I'm not sure when I will be needing Git's advanced features, but currently Git itself doesn't seem to offer any benefits to my work flow (although Github's code review is useful).

Conclusion

So all in all, not too bad a start with Git, although at the moment I am not using the full power of Git (i.e. forking/merging) as I have absolutely no need for it and would prefer to spend my time coding than messing with version control.

I am currently just pushing back to the master repository at Github, and although this isn't very 'cool', it suits my needs well (as SVN did).

My message to those considering switching to Git would be: If you are having trouble with conflicting commits or are looking to use a hosted service like Github that gives you added extras (like code reviews) then go for it.

If on the other hand you are happy with SVN and its limitations, then stick with it, and if you are looking for the added benefits of a hosted service, try something like Unfuddle.

Threading Vs. Forking and PHP

Posted 01/03/2010 19:13

Found two excellent articles today on Threading Vs. Forking, and using the forking extension in PHP (pcntl).

Suffice to say, being a Linux user, I prefer using forking rather than threading. I have had experience with threading in Java. I guess I just enjoy the safety net that forking gives you. I also prefer the fact that you have to have a defined protocol between two processes to achieve IPC (inter-process communication), whereas in threading each thread can access shared memory directly.
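
To make the difference concrete, here is a small sketch of forking with an explicit IPC channel. It uses Python's os.fork and os.pipe rather than PHP's pcntl extension, purely for illustration, and it only runs on UNIX-like systems. run_in_child and task are hypothetical names.

```python
import os


def run_in_child(task):
    """Fork, run task() in the child, and return its string result via a pipe."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: send the result over the pipe, then exit immediately so that
        # none of the parent's remaining code runs in this process.
        os.close(read_fd)
        try:
            os.write(write_fd, task().encode("utf-8"))
        finally:
            os._exit(0)
    # Parent: read until the child closes its end of the pipe, then reap it.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as reader:
        data = reader.read()
    os.waitpid(pid, 0)
    return data.decode("utf-8")


counter = {"value": 0}


def task():
    # This change happens in the child's own copy of memory.
    counter["value"] += 1
    return "child saw counter=%d" % counter["value"]


message = run_in_child(task)
print(message, counter["value"])
```

The child's increment never reaches the parent's copy of counter; the only thing that crosses the process boundary is what was deliberately written to the pipe. With threads, the same increment would be visible to every thread.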

StartCom Free SSL Certificates

Posted 01/03/2010 18:58

Yesterday I started using StartCom's free SSL certificate authority to replace some self-signed certificates I had previously been using on my web sites.

StartCom are, as far as I am aware, the only company who provide free 'proper' SSL certificates (i.e. ones that have their Certificate Authority certificate embedded in the majority of web browsers).

The process was fairly painless. You first have to generate an SSL certificate for your web browser. This allows you access to their control panel (no usernames and passwords).

Then to get an SSL certificate for a domain name, you have to validate that you own that domain (by sending an email to postmaster@domain.com). This took a long time to complete (5 minutes or so), so I thought it had crashed on the first few attempts.

Eventually though it completed, and after that it was a breeze.

As with anything to do with SSL, having a basic understanding of what all these keys and certificate files actually do is essential, otherwise you will quickly get confused.

PHP UK 2010 Conference

Posted 28/02/2010 18:49

On Friday Feb 26 2010 I visited the PHP UK 2010 Conference in London with two of my colleagues from work.

This was my first PHP conference, and I enjoyed the presentations. Probably the most useful to me was the talk given by Sticky Eyes on optimising MySQL and Message Queues for a high traffic SEO agency.

This talk mentioned Beanstalkd, which is an open source message queue. I remember looking at this application last year, however it did not have persistence at that time. Now it does, so I am seriously considering implementing it at work.

My other favourite talk was given by IBuildings on implementing Web Services. I am a supporter of RESTful APIs as they appeal to me as a simple and pragmatic way to expose services to other processes, without the complexity that SOAP brings.

The lunch supplied wasn't great, but the free beers at the end of the day provided by Facebook were much appreciated!