EXIF Remover
It seems that some websites have started to use over-zealous (and faulty) detection algorithms on uploaded images. I have had problems uploading images that are rotated 90 degrees, because the websites in question decide to look at the EXIF data in the JPEG images instead of the actual bitmap, and then counter-rotate the images! To fix this problem, I have made a simple program that strips away all the EXIF data in a JPEG image. It reads the image from standard input and writes the result to standard output.
Here is the C code:
#include <stdio.h>

int main(void)
{
  int c, jpeg_marker_found, length_low, length_high, in_exif;

  jpeg_marker_found = in_exif = 0;
  while ((c = fgetc(stdin)) != EOF) {
    if (in_exif > 0) {
      in_exif--;
      continue;
    }
    if (jpeg_marker_found && c == 0xE1) { /* APP1 marker, used by EXIF. */
      if ((length_high = fgetc(stdin)) == EOF)
        return 1;
      if ((length_low = fgetc(stdin)) == EOF)
        return 1;
      /* The length counts its own two bytes, so length - 2 payload bytes
         follow. Skip one byte more than that, the next marker's 0xFF,
         because this marker's 0xFF has already been printed and takes
         its place... */
      in_exif = length_low + (length_high * 0x100) - 1;
      /* ...and this marker's 0xE1 will also not be printed. */
    } else {
      fputc(c, stdout);
    }
    jpeg_marker_found = (c == 0xFF) ? 1 : 0;
  }

  return 0;
}
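For comparison, here is a minimal sketch of the same idea that works on a whole image held in memory and walks the JPEG segment structure explicitly, instead of filtering byte by byte. The function name `strip_app1` is made up for this example and is not part of the program above. It assumes well-formed segments with no fill bytes between them, and it copies everything from the start-of-scan marker onwards untouched.

```c
#include <stddef.h>
#include <string.h>

/* strip_app1: Copy the JPEG in `in` to `out`, leaving out APP1 segments.
 * Returns the number of bytes written, or 0 if the input does not start
 * with a JPEG SOI marker. `out` must have room for at least `n` bytes.
 * Hypothetical helper for illustration only. */
size_t strip_app1(const unsigned char *in, size_t n, unsigned char *out)
{
  size_t i = 2, o = 2;

  if (n < 2 || in[0] != 0xFF || in[1] != 0xD8) /* SOI marker */
    return 0;
  out[0] = in[0];
  out[1] = in[1];

  /* Walk the segments: 0xFF, marker type, 16-bit big-endian length. */
  while (i + 4 <= n && in[i] == 0xFF) {
    unsigned char type = in[i + 1];
    /* Segment size = two marker bytes + length (which counts itself). */
    size_t seg = 2 + (((size_t)in[i + 2] << 8) | in[i + 3]);

    if (type == 0xDA || seg < 4 || seg > n - i) /* SOS or malformed */
      break;
    if (type != 0xE1) { /* Keep everything except APP1. */
      memcpy(out + o, in + i, seg);
      o += seg;
    }
    i += seg;
  }

  /* Copy the scan data and anything else that follows verbatim. */
  memcpy(out + o, in + i, n - i);
  return o + (n - i);
}
```

Unlike the streaming filter, this version only looks for markers at segment boundaries, so a stray 0xFF 0xE1 pair inside another segment's payload is left alone.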
TCP proxy with Netcat
Here is an interesting way to set up a generic TCP proxy using the netcat tool. What makes this method interesting is that traffic is not simply forwarded, but also sent back the other way. In order to do this, we require two Unix pipes. One is created using the shell's own mechanism, and the other one is created manually as a named pipe (also known as a fifo).
Imagine we want to connect with telnet to some remote host, through the proxy like this:
local-host <-> proxy-host <-> remote-host.
On the proxy-host, enter this on the shell:
mkfifo /tmp/backpipe
nc -l -p 31337 < /tmp/backpipe | nc remote-host 23 > /tmp/backpipe
On the local-host, you can now enter this...:
telnet proxy-host 31337
...to reach port 23 on the remote host, and the traffic will flow in both directions.
Base64 Decoder and Encoder
I looked through my old code collection and found the source code for these two small tools, which encode and decode Base64 files in the MIME-compatible format.
Here is the decoder code:
#include <stdio.h>

enum {
  MIME_62 = '+',
  MIME_63 = '/',
  MIME_PADDING = '=',
  MIME_ERROR_NUMBER = 64, /* Not valid base64 number. */
};

/* mime_to_base64: Convert MIME compatible ASCII character to number. */
int mime_to_base64(int n)
{
  if (n >= 'a' && n <= 'z') /* Lowercase letters */
    return n - 0x47;
  else if (n >= 'A' && n <= 'Z') /* Uppercase letters */
    return n - 0x41;
  else if (n >= '0' && n <= '9') /* Numbers */
    return n + 0x4;
  else if (n == MIME_62)
    return 62;
  else if (n == MIME_63)
    return 63;
  else
    return MIME_ERROR_NUMBER; /* Unknown (Skip) */
}

/* main: Input and output filter. */
int main(void)
{
  int c, n, i;
  unsigned char ascii[3]; /* 24-bit buffer */
  unsigned char base64[4]; /* 4 characters */

  n = 0;
  while ((c = fgetc(stdin)) != EOF) {
    if (c == MIME_PADDING) /* No need to get more data. */
      break;
    if ((base64[n] = mime_to_base64(c)) == MIME_ERROR_NUMBER)
      continue; /* Unknown character (e.g. newline), skip it. */

    if (n == 3) { /* Buffer full, time to output. */
      ascii[0] = base64[0] << 2;
      ascii[0] += (base64[1] & 0x30) >> 4; /* 110000 */
      ascii[1] = (base64[1] & 0xf) << 4; /* 001111 */
      ascii[1] += ((base64[2] & 0x3c) >> 2); /* 111100 */
      ascii[2] = ((base64[2] & 0x3) << 6); /* 000011 */
      ascii[2] += base64[3];
      for (i = 0; i < 3; i++) {
        fputc(ascii[i], stdout);
      }
      n = 0;
    } else
      n++;
  }

  /* Check for remaining data in buffer. */
  if (n == 2) { /* Two paddings, one character missing. */
    ascii[0] = base64[0] << 2;
    ascii[0] += (base64[1] & 0x30) >> 4;
    fputc(ascii[0], stdout);
  } else if (n == 3) { /* One padding, two characters missing. */
    ascii[0] = base64[0] << 2;
    ascii[0] += (base64[1] & 0x30) >> 4;
    ascii[1] = (base64[1] & 0xf) << 4;
    ascii[1] += ((base64[2] & 0x3c) >> 2);
    fputc(ascii[0], stdout);
    fputc(ascii[1], stdout);
  }

  return 0;
}
And here is the encoder code:
#include <stdio.h>

enum {
  MIME_62 = '+',
  MIME_63 = '/',
  MIME_PADDING = '=',
};

/* base64_to_mime: Convert number to MIME compatible ASCII character. */
int base64_to_mime(int n)
{
  if (n == 63)
    return MIME_63;
  else if (n == 62)
    return MIME_62;
  else if (n > 51) /* Numbers */
    return n - 0x4;
  else if (n > 25) /* Lowercase letters */
    return n + 0x47;
  else /* Uppercase letters */
    return n + 0x41;
}

/* main: Input and output filter. */
int main(void)
{
  int c, n, i;
  unsigned char ascii[3]; /* 24-bit buffer */
  unsigned char base64[4]; /* 4 characters */

  n = 0;
  while ((c = fgetc(stdin)) != EOF) {
    ascii[n] = c;

    if (n == 2) { /* Buffer full, time to output. */
      base64[0] = ascii[0] >> 2;
      base64[1] = (ascii[1] >> 4) & 0xf; /* 00001111 */
      base64[1] += ((ascii[0] & 0x3) << 4); /* 00000011 */
      base64[2] = (ascii[1] << 2) & 0x3c; /* 00111100 */
      base64[2] += ((ascii[2] & 0xc0) >> 6); /* 11000000 */
      base64[3] = ascii[2] & 0x3f; /* 00111111 */
      for (i = 0; i < 4; i++) {
        fputc(base64_to_mime(base64[i]), stdout);
      }
      n = 0;
    } else
      n++;
  }

  /* Check for remaining data in buffer. */
  if (n == 1) {
    base64[0] = ascii[0] >> 2;
    base64[1] = (ascii[0] & 0x3) << 4;
    fputc(base64_to_mime(base64[0]), stdout);
    fputc(base64_to_mime(base64[1]), stdout);
    fputc(MIME_PADDING, stdout);
    fputc(MIME_PADDING, stdout);
  } else if (n == 2) {
    base64[0] = ascii[0] >> 2;
    base64[1] = (ascii[1] >> 4) & 0xf;
    base64[1] += ((ascii[0] & 0x3) << 4);
    base64[2] = (ascii[1] << 2) & 0x3c;
    fputc(base64_to_mime(base64[0]), stdout);
    fputc(base64_to_mime(base64[1]), stdout);
    fputc(base64_to_mime(base64[2]), stdout);
    fputc(MIME_PADDING, stdout);
  }

  return 0;
}
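To make the bit shuffling easier to follow, here is a small worked example of how one group of up to three bytes becomes four Base64 characters. It is only a sketch: it uses a lookup table rather than the arithmetic in `base64_to_mime`, and the names `b64_alphabet` and `encode_group` are invented for the illustration.

```c
#include <stddef.h>

/* The 64 symbols of the MIME Base64 alphabet, indexed by 6-bit value. */
static const char b64_alphabet[] =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* encode_group: Encode `len` bytes (1 to 3) into four Base64 characters,
 * padding with '=' when fewer than three bytes are given.
 * Hypothetical helper for illustration only. */
void encode_group(const unsigned char *in, size_t len, char out[4])
{
  unsigned long bits = 0;
  size_t i;

  for (i = 0; i < 3; i++) /* Pack the bytes into a 24-bit value. */
    bits = (bits << 8) | (i < len ? in[i] : 0);

  /* Cut the 24 bits into four 6-bit numbers, most significant first. */
  out[0] = b64_alphabet[(bits >> 18) & 0x3f];
  out[1] = b64_alphabet[(bits >> 12) & 0x3f];
  out[2] = len > 1 ? b64_alphabet[(bits >> 6) & 0x3f] : '=';
  out[3] = len > 2 ? b64_alphabet[bits & 0x3f] : '=';
}
```

For the input "Man" (0x4D 0x61 0x6E) the 24 bits split into the numbers 19, 22, 5 and 46, which gives "TWFu". With only "Ma" the result is "TWE=", and with just "M" it is "TQ==".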
Image Scaler Script
When uploading images taken with a digital camera to the interwebs, I always want to scale them down to a smaller size. Automating this process is of course a no-brainer, but the trick is to consider the orientation of the image. I like to use the size 800x600, but images taken with the camera turned 90 degrees should be resized to 600x800 instead.
This Bourne shell script uses ImageMagick to do the job:
#!/bin/sh

for i in "$@"; do
  INFO=`identify "$i"` || continue
  WIDTH=`echo "$INFO" | sed -r -e 's/.* ([0-9]+)x[0-9]+ .*/\1/'`
  HEIGHT=`echo "$INFO" | sed -r -e 's/.* [0-9]+x([0-9]+) .*/\1/'`
  if [ "$WIDTH" -ge "$HEIGHT" ]; then
    echo "$i: W >= H"
    mogrify -scale 800x600 "$i"
  else
    echo "$i: H > W"
    mogrify -scale 600x800 "$i"
  fi
done
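The decision the script makes, and the aspect-preserving fit that a WxH geometry asks ImageMagick for, can be sketched in C roughly like this. The names `struct size` and `fit_box` are made up for the example, the width and height are assumed to be positive, and ImageMagick's own rounding may differ by a pixel.

```c
struct size {
  int w, h;
};

/* fit_box: Return the largest size with the same aspect ratio as w x h
 * that fits in 800x600 (landscape or square) or 600x800 (portrait).
 * Hypothetical helper for illustration only. */
struct size fit_box(int w, int h)
{
  int box_w = (w >= h) ? 800 : 600;
  int box_h = (w >= h) ? 600 : 800;
  struct size out;

  /* The width is the limiting side when box_w / w <= box_h / h,
     which in integer math is box_w * h <= box_h * w. */
  if ((long)box_w * h <= (long)box_h * w) {
    out.w = box_w;
    out.h = (int)(((long)h * box_w + w / 2) / w); /* Rounded. */
  } else {
    out.h = box_h;
    out.w = (int)(((long)w * box_h + h / 2) / h); /* Rounded. */
  }
  return out;
}
```

A 4000x3000 photo comes out as 800x600 and a 3000x4000 photo as 600x800, while a 3:2 image such as 3000x2000 ends up at 800x533 because only the width reaches the edge of the box.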
Duplicate File Remover
Here is a long Bourne shell one-liner I hacked together to (re)move duplicate files from a directory structure:
mkdir "./duplicate" && find . -type f -exec md5sum {} \; | \
  sort | uniq -D -w 32 | awk '{ print $1, length, $2 }' | \
  sort -k1,1 -k2,2n | awk '($1 == x) { print $1, $3 } ($1 != x) { x = $1 }' | \
  cut -b 33- | xargs -I {} mv -v {} "./duplicate"
I could have used Perl or Python, but that is not as fun or challenging!
The weirdest part is the use of awk to print the length of the line in between the MD5 sum and the filename. This is required to be able to sort each group of identical MD5 sums so that the files with the longest paths are printed last. This in turn makes sure that files deeper down in the directory structure are the ones that get moved. The other awk part removes the first line in a series of identical MD5 sums. This is required because one file will of course have to remain in the directory structure!
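The "skip the first line of each series" step done by the second awk stage can be sketched in C as follows. This is only an illustration of the logic, not a replacement for the one-liner, and `struct entry` and `mark_duplicates` are names invented here. The entries are assumed to be sorted already, by MD5 sum first and by path length second.

```c
#include <stddef.h>
#include <string.h>

struct entry {
  const char *md5;  /* MD5 sum as a hex string */
  const char *path; /* File name */
};

/* mark_duplicates: Set dup[i] to 1 for every entry whose MD5 sum equals
 * that of the entry before it, and to 0 otherwise, so the first entry
 * of each series stays in place. Returns the number of duplicates.
 * Hypothetical helper for illustration only. */
size_t mark_duplicates(const struct entry *e, size_t n, int *dup)
{
  size_t i, count = 0;

  for (i = 0; i < n; i++) {
    dup[i] = (i > 0 && strcmp(e[i].md5, e[i - 1].md5) == 0);
    count += dup[i];
  }
  return count;
}
```

Because the shortest path comes first within each series, that is the copy that is kept, and the deeper ones are marked for moving.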
I prefer to move (mv) the files instead of actually removing (rm) them, so everything can be double checked afterwards.
By the way, this one-liner can be used to remove any remaining empty directories in the structure:
find . -type d | sort -r | xargs rmdir --ignore-fail-on-non-empty
UDP P2P File Transfer
This is a highly experimental program to transfer files from one host to another using only UDP. It is not very fast (maybe about 4-5 times slower than an FTP transfer over TCP), but I can imagine that it may be useful in some rare cases, for instance where TCP communications are blocked by a firewall. The operating principle is similar to that of TFTP transfers, but this program contains a user interface to retrieve file listings, etc. Also, every program instance is both a client and a server, so it is more geared towards being a P2P application.
Be warned that the program probably contains a few bugs, and has at least the following known errors and problems:
- Files can become corrupted if some other source injects false UDP packets.
- Possible endianness problems. I had no way to test this, so there may be issues when using the application on a big-endian computer.
- There is no check to prevent the same file from being shared twice.
- If the server changes a file, this is not detected by the client.
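On the endianness point, a common way to sidestep the problem is to write multi-byte integers into the packet one byte at a time in a fixed order, instead of copying them straight from memory. The sketch below shows this for a 32-bit value in big-endian (network) order. It is a generic illustration and says nothing about the program's actual packet layout, and `put_u32_be` and `get_u32_be` are made-up names.

```c
#include <stdint.h>

/* put_u32_be: Store v in buf[0..3], most significant byte first.
 * Hypothetical helper for illustration only. */
void put_u32_be(unsigned char *buf, uint32_t v)
{
  buf[0] = (unsigned char)(v >> 24);
  buf[1] = (unsigned char)(v >> 16);
  buf[2] = (unsigned char)(v >> 8);
  buf[3] = (unsigned char)v;
}

/* get_u32_be: Read back a value stored by put_u32_be. */
uint32_t get_u32_be(const unsigned char *buf)
{
  return ((uint32_t)buf[0] << 24) |
         ((uint32_t)buf[1] << 16) |
         ((uint32_t)buf[2] << 8) |
         (uint32_t)buf[3];
}
```

Since only shifts are used, the bytes on the wire are the same no matter which byte order the sending or receiving machine uses.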
I have released the code under an MIT License here.