Information wants to be free...

XSPF Integrity Check

This is a Python script to check the integrity of XSPF playlist files. By integrity check, I mean checking whether the files that the playlist references actually exist.

You may notice that I use a regular expression with a lambda expression to decode the URL encoding instead of using the standard urllib.unquote() routine. There is a good reason for this, namely that urllib.unquote() returns a "unicode" formatted string instead of a regular "str" Python string. I happen to use Latin-1 encoding on my filenames, and in order to properly decode these, the built-in decode() function must be used, but that one only works on regular "str" strings!

Anyway, here's my script:

#!/usr/bin/python

import xml.dom.minidom
import re
import os.path

def xspf_parse(playlist_filename, handler):
    xml_data = xml.dom.minidom.parse(playlist_filename)
    for playlist in xml_data.getElementsByTagName("playlist"):
        for tracklist in playlist.getElementsByTagName("trackList"):
            for track in tracklist.getElementsByTagName("track"):
                for location in track.getElementsByTagName("location"):
                    data = re.sub("%([0-9a-fA-F]{2})", \
                        lambda x: chr(int(x.group(1), 16)), \
                        location.firstChild.data.encode("utf-8"))
                    track_filename = data.decode("utf-8").replace("file://", "")
                    handler(playlist_filename, track_filename)

def file_check(playlist_filename, track_filename):
    if not os.path.isfile(track_filename):
        print playlist_filename, "-->", track_filename

if __name__ == "__main__":
    import sys

    if len(sys.argv) < 2:
        print "Usage: %s <xspf file> ... <xspf file>" % (sys.argv[0])
        sys.exit(1)

    for filename in sys.argv[1:]:
        xspf_parse(filename, file_check)

    sys.exit(0)
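
For comparison, the same decoding step can be sketched in Python 3, where the percent-decoding is done on bytes and the result is then decoded explicitly. This is only an illustration under assumptions: the helper name decode_location and the example path are made up, and Latin-1 is assumed as the filename encoding, as described above.

```python
import re

def decode_location(location, encoding="latin-1"):
    """Turn an XSPF <location> value into a local filename (hypothetical helper)."""
    raw = location.encode("utf-8")
    # Replace each %XX escape with the single byte it stands for.
    raw = re.sub(rb"%([0-9a-fA-F]{2})",
                 lambda m: bytes([int(m.group(1), 16)]), raw)
    # Decode the raw bytes with the encoding assumed for the filenames.
    name = raw.decode(encoding)
    if name.startswith("file://"):
        name = name[len("file://"):]
    return name
```

Calling decode_location("file:///music/a%20b.mp3") gives "/music/a b.mp3". With the Latin-1 default, an escape such as %E5 becomes a single character instead of being treated as part of a UTF-8 sequence.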
          


Topic: Scripts and Code, by Kjetil @ 11/12-2014, Article Link

MP3 Splitting Tool

This tool is very similar to the MP3 Cutting Tool I made some years ago, but covers a different use case. The intention with this tool is to cut away the beginning or ending of an MP3 file in a lossless manner, by splitting it at a frame boundary.

Using the tool on an MP3 file without any additional arguments will just print the total number of frames. It will then be necessary to specify at which frame the splitting should occur, and the result will be two new MP3 files (before and after the frame). Please note that this operation will likely mess up ID3 tags in the file, so I recommend removing all old tags and re-tagging the files afterwards.

Here is the modified source code:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <limits.h>

static int bitrate_matrix[16][5] = {
  {0,   0,   0,   0,   0},
  {32,  32,  32,  32,  8},
  {64,  48,  40,  48,  16},
  {96,  56,  48,  56,  24},
  {128, 64,  56,  64,  32},
  {160, 80,  64,  80,  40},
  {192, 96,  80,  96,  48},
  {224, 112, 96,  112, 56},
  {256, 128, 112, 128, 64},
  {288, 160, 128, 144, 80},
  {320, 192, 160, 160, 96},
  {352, 224, 192, 176, 112},
  {384, 256, 224, 192, 128},
  {416, 320, 256, 224, 144},
  {448, 384, 320, 256, 160},
  {0,   0,   0,   0,   0}};

static int sampling_matrix[4][3] = {
  {44100, 22050, 11025},
  {48000, 24000, 12000},
  {32000, 16000, 8000},
  {0,     0,     0}};

static int decode_header(unsigned char *header)
{
  int version, layer, padding;
  int bitrate_row, bitrate_col, sampling_row, sampling_col;

  version = (header[1] & 0x08) >> 3; /* MPEG version. */
  layer = (header[1] & 0x06) >> 1; /* MPEG layer. */

  bitrate_row = (header[2] & 0xf0) >> 4;
  bitrate_col = -1;
  if (version == 1) {
    if (layer == 3)      /* I */
      bitrate_col = 0;
    else if (layer == 2) /* II */
      bitrate_col = 1;
    else if (layer == 1) /* III */
      bitrate_col = 2;
  } else { /* Version 2 */
    if (layer == 3)      /* I */
      bitrate_col = 3;
    else if (layer == 2) /* II */
      bitrate_col = 4;
    else if (layer == 1) /* III */
      bitrate_col = 4;
  }

  sampling_row = (header[2] & 0x0c) >> 2;
  sampling_col = (version == 0) ? 1 : 0;

  padding = (header[2] & 0x02) >> 1;

  if (sampling_matrix[sampling_row][sampling_col] == 0)
    return -1; /* Cannot divide by zero. */

  if (layer == 3) /* I: slots are 4 bytes, so padding is added before scaling. */
    return (12 * (bitrate_matrix[bitrate_row][bitrate_col] * 1000) /
      sampling_matrix[sampling_row][sampling_col] + padding) * 4;
  else if (layer == 2 || layer == 1) /* II or III */
    return 144 * (bitrate_matrix[bitrate_row][bitrate_col] * 1000) /
      sampling_matrix[sampling_row][sampling_col] + padding;
  else
    return -1;
}

static int read_frames(FILE *src, FILE *dst, int frame_limit)
{
  int c, n, frame_length, no_of_frames;
  unsigned char quad[4];

  quad[0] = quad[1] = quad[2] = quad[3] = '\0';

  frame_length = n = no_of_frames = 0;
  while ((c = fgetc(src)) != EOF) {
    if (dst != NULL)
      fputc(c, dst);
  
    if (frame_length > 0) {
      frame_length--;
      n++;

      /* While cutting the file, a frame limit is specified to stop reading. */
      if (frame_limit > 0) {
        if (frame_length == 0 && no_of_frames == frame_limit)
          return no_of_frames; /* Return early, but filehandle must be left
                                  intact by the caller to continue at the right
                                  spot! */
      }

      /* Skip ahead in stream to avoid reading garbage. */
      continue;
    }

    /* Have a potential header ready for each read. */
    quad[0] = quad[1];
    quad[1] = quad[2];
    quad[2] = quad[3];
    quad[3] = c;

    /* Match frame sync. */
    if ((quad[0] == 0xff) && ((quad[1] & 0xf0) == 0xf0)) {
      no_of_frames++;
      frame_length = decode_header(quad) - 4;
      quad[0] = quad[1] = quad[2] = quad[3] = '\0';
    }

    n++;
  }
  
  return no_of_frames;
}

int main(int argc, char *argv[])
{
  int i, no_of_frames, parts, limit;
  FILE *src, *dst;
  char filename[PATH_MAX]; /* POSIX limit. */

  if (argc != 3) {
    fprintf(stderr, "Usage: %s <mp3-file> <no-of-parts>\n", argv[0]);
    return 1;
  }

  parts = atoi(argv[2]);
  if (parts <= 0) {
    fprintf(stderr, "Error: Invalid number of parts specified.\n");
    return 1;
  }

  src = fopen(argv[1], "rb"); /* Binary mode, MP3 data is not text. */
  if (src == NULL) {
    fprintf(stderr, "Error: Unable to open file for reading: %s\n",
      strerror(errno));
    return 1;
  }

  no_of_frames = read_frames(src, NULL, 0);
  if (parts > no_of_frames) {
    fprintf(stderr, "Error: More parts than available frames specified.\n");
    fclose(src);
    return 1;
  }

  rewind(src);

  for (i = 1; i <= parts; i++) {
    snprintf(filename, sizeof(filename), "%s.%02d", argv[1], i);

    dst = fopen(filename, "wb"); /* Binary mode, MP3 data is not text. */
    if (dst == NULL) {
      fprintf(stderr, "Error: Unable to open file for writing: %s\n",
        strerror(errno));
      fclose(src);
      return 1;
    }

    if (i == parts)
      limit = 0; /* Make sure all frames are read on the last part,
                    rounding errors in the formula prevents this. */
    else
      limit = no_of_frames / parts;

    fprintf(stderr, "%02d: %s: %d\n", i, filename, 
      read_frames(src, dst, limit));

    fclose(dst);
  }

  fclose(src);
  return 0;
}
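
As a rough illustration of what decode_header() computes, here is a Python 3 sketch of the frame length calculation, restricted to the most common case of MPEG-1 Layer III. It is a simplification and not a port of the C code: the function name and the decision to return None for anything else are assumptions made for this example. The bitrate values are the third column of the bitrate matrix above.

```python
# Bitrates (kbit/s) and sampling rates (Hz) for MPEG-1 Layer III only.
BITRATES = [0, 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320, 0]
SAMPLE_RATES = [44100, 48000, 32000, 0]

def frame_length(header):
    """Return the frame length in bytes, or None if the four header bytes
    are not a usable MPEG-1 Layer III frame header."""
    # 0xFF followed by 1111 101x: frame sync, MPEG-1, Layer III.
    if len(header) < 4 or header[0] != 0xFF or (header[1] & 0xFE) != 0xFA:
        return None
    bitrate = BITRATES[header[2] >> 4] * 1000
    sample_rate = SAMPLE_RATES[(header[2] >> 2) & 0x03]
    padding = (header[2] >> 1) & 0x01
    if bitrate == 0 or sample_rate == 0:
        return None  # Free-format or reserved values are not handled here.
    return 144 * bitrate // sample_rate + padding
```

For a 128 kbit/s, 44.1 kHz frame (header bytes FF FB 90 00) this gives 417 bytes, or 418 when the padding bit is set.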
          


Topic: Scripts and Code, by Kjetil @ 29/11-2014, Article Link

FLAC Tagging Helper

Here is a small Python script that helps with tagging FLAC files. It relies on the external "metaflac" tool to do this.

The idea with this script is to combine it with an album template file that contains all the information that should be tagged for a set of files in a specific directory. Since the "--no-utf8-convert" flag is used, the album file should use UTF-8 for any non-ASCII characters. Each line represents a tag, and if a line is prefixed with a number and a colon, the tag will only be applied to that track number. To identify which files to tag in the directory, the script expects filenames that start with the track number followed by " - " (a dash surrounded by spaces).

Here is an example of an album template file:

ALBUM=Foo
DATE=2001
GENRE=Rock
01:TRACKNUMBER=1
01:ARTIST=Bar
01:TITLE=Baz
02:TRACKNUMBER=2
02:ARTIST=Bar
02:TITLE=Baaaaz
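
To make the template format concrete, here is a small Python 3 sketch of how such lines can be separated into album-wide tags and per-track tags. It uses the same two patterns as described above, but the function name parse_album_template and its return shape are assumptions for illustration only.

```python
import re

def parse_album_template(lines):
    """Split template lines into (global_tags, track_tags)."""
    global_tags = []
    track_tags = {}
    for line in lines:
        line = line.rstrip("\n")
        # "NN:KEY=value" applies only to track NN.
        match = re.match(r"^(\d+):(\w+=.*)$", line)
        if match:
            track_tags.setdefault(match.group(1), []).append(match.group(2))
            continue
        # "KEY=value" applies to every track.
        if re.match(r"^\w+=.*$", line):
            global_tags.append(line)
    return global_tags, track_tags
```

Fed the example above, the global list holds the ALBUM, DATE and GENRE lines, and the dictionary has the keys "01" and "02" with three tags each.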
          


Here is the Python script itself:

#!/usr/bin/python

import re
import os

class FLACTagger(object):
    def __init__(self, flac_directory, album_file, cover_image):
        self._flac_directory = flac_directory
        self._album_file = album_file
        self._cover_image = cover_image
        self._global_tags = list()
        self._track_tags = dict()

    def read_album_file(self):
        fh = open(self._album_file, "r")
        for line in fh:
            match = re.match(r"^(\d+):(\w+=.*)$", line)
            if match:
                if not match.group(1) in self._track_tags:
                    self._track_tags[match.group(1)] = list()
                self._track_tags[match.group(1)].append(match.group(2))

            match = re.match(r"^(\w+=.*)$", line)
            if match:
                self._global_tags.append(match.group(1))

        fh.close()

    def make_flactags_files(self):
        for track_no in self._track_tags.keys():
            fh = open("/tmp/%s.flactags" % (track_no), "w")
            for tags in self._global_tags:
                fh.write(tags)
                fh.write("\n")
            for tags in self._track_tags[track_no]:
                fh.write(tags)
                fh.write("\n")
            fh.close()

    def _quote(self, filename):
        return "'" + filename.replace("'", "'\\''") + "'"

    def apply_tags(self):
        for filename in os.listdir(self._flac_directory):
            match = re.match(r"^(\d+) - ", filename)
            full_path = self._quote(self._flac_directory + "/" + filename)
            if match:
                print full_path
                os.system("metaflac --remove-all %s" % (full_path))
                os.system("metaflac --no-utf8-convert --import-tags-from=/tmp/%s.flactags %s" % (match.group(1), full_path))
                os.system("metaflac --import-picture-from=%s %s" % (self._quote(self._cover_image), full_path))



if __name__ == "__main__":
    import sys

    if len(sys.argv) < 4:
        print "Usage: %s <FLAC directory> <album file> <cover image>" % (sys.argv[0])
        sys.exit(1)
    
    if not os.path.isdir(sys.argv[1]):
        print "Error: Invalid file directory"
        sys.exit(1)

    if not os.path.isfile(sys.argv[2]):
        print "Error: Invalid album file"
        sys.exit(1)

    if not os.path.isfile(sys.argv[3]):
        print "Error: Invalid cover image"
        sys.exit(1)

    ftm = FLACTagger(sys.argv[1], sys.argv[2], sys.argv[3])
    ftm.read_album_file()
    ftm.make_flactags_files()
    ftm.apply_tags()
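
The _quote() helper is needed because os.system() passes the command through a shell. A possible alternative, sketched here in Python 3 and not what the script above does, is to build each metaflac call as an argument list so that no quoting is required. The function name build_metaflac_commands is hypothetical; the metaflac flags are the same ones used in the script.

```python
def build_metaflac_commands(flac_path, tags_path, cover_path):
    """Return the three metaflac invocations as argument lists."""
    return [
        ["metaflac", "--remove-all", flac_path],
        ["metaflac", "--no-utf8-convert",
         "--import-tags-from=" + tags_path, flac_path],
        ["metaflac", "--import-picture-from=" + cover_path, flac_path],
    ]
```

Each list can then be handed to subprocess.call(), which runs the program directly, so a filename containing spaces or quotes is passed through unchanged.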
          


Topic: Scripts and Code, by Kjetil @ 01/08-2014, Article Link

Substitution Cipher Cryptanalysis

Here is another re-release. Several years ago, I made a tool to crack substitution ciphers. I have cleaned up the code and made some improvements.

The program uses setlocale() to modify the effect of e.g. the isalpha() and toupper() standard C functions. This makes it possible to support several languages that use more than just A to Z.

Here's what it looks like in action:

Screenshot of SCCA.

(Press F1 or F5 for help.)

Compile this code, and remember to link with the curses library:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
#include <curses.h>
#include <limits.h>
#include <locale.h>

#define TEXT_MAX 65536
#define SAVE_EXTENSION "scca"
#define PAGE_OFFSET_SKIP 10

static int allowed_char[UCHAR_MAX];
static unsigned char cipher[UCHAR_MAX];
static unsigned char text[TEXT_MAX] = {'\0'};
static int allowed_char_len;
static int cipher_pos = 0;
static int text_offset = 0;

static void cipher_init(void)
{
  unsigned char c;

  setlocale(LC_ALL, "");

  allowed_char_len = 0;
  for (c = 0; c < UCHAR_MAX; c++) {
    if (isupper(c)) {
      allowed_char[c] = allowed_char_len;
      allowed_char_len++;
    } else {
      allowed_char[c] = -1;
    }
    cipher[c] = ' ';
  }
}

static void cipher_erase(void)
{
  unsigned char c;
  for (c = 0; c < UCHAR_MAX; c++) {
    cipher[c] = ' ';
  }
}

static unsigned char cipher_applied(unsigned char plain)
{
  unsigned char c;

  if (isupper(plain)) {
    c = allowed_char[plain];
    if (cipher[c] == ' ') {
      return plain;
    } else {
      return cipher[c];
    }
  } else {
    return plain;
  }
}

static int text_read(char *filename)
{
  int c, n;
  FILE *fh;

  fh = fopen(filename, "r");
  if (! fh) {
    fprintf(stderr, "fopen() failed on file: %s\n", filename);
    return 1;
  }

  setlocale(LC_ALL, "");

  n = 0;
  while ((c = fgetc(fh)) != EOF) {
    if (n >= TEXT_MAX - 1)
      break; /* Leave room for the terminating NUL. */
    if (c == '\r')
      continue; /* CR causes issues, just strip it. */
    text[n] = toupper(c);
    n++;
  }

  fclose(fh);
  return 0;
}

static void text_save(char *old_filename)
{
  int i;
  FILE *fh;
  static char new_filename[PATH_MAX];

  snprintf(new_filename, PATH_MAX, "%s.%s", old_filename, SAVE_EXTENSION);

  erase();

  fh = fopen(new_filename, "w");
  if (fh == NULL) {
    mvprintw(0, 0, "Could not open file for writing: %s", new_filename);
  } else {
    for (i = 0; i < TEXT_MAX; i++) {
      if (text[i] == '\0')
        break;
      fputc(cipher_applied(text[i]), fh);
    }
    mvprintw(0, 0, "Deciphered text saved to: %s", new_filename);
    fclose(fh); /* Only close the handle if fopen() succeeded. */
  }

  mvprintw(1, 0, "Press any key to continue...");
  refresh();

  flushinp();
  getch(); /* Wait for keypress. */
  flushinp();
}

static void display_help(void)
{
  erase();
  mvprintw(0,  0, "Left:        Move cipher cursor left.");
  mvprintw(1,  0, "Right:       Move cipher cursor right.");
  mvprintw(2,  0, "Up:          Scroll one line up.");
  mvprintw(3,  0, "Down:        Scroll one line down.");
  mvprintw(4,  0, "Page Up:     Scroll %d lines up.", PAGE_OFFSET_SKIP);
  mvprintw(5,  0, "Page Down:   Scroll %d lines down.", PAGE_OFFSET_SKIP);
  mvprintw(6,  0, "Space:       Erase cipher character.");
  mvprintw(7,  0, "[A-Z]:       Insert cipher character.");
  mvprintw(8,  0, "F1 / F5:     Display this help.");
  mvprintw(9,  0, "F2 / F6:     Display character frequency.");
  mvprintw(10, 0, "F3 / F7:     Reset cipher. (Erase all.)");
  mvprintw(11, 0, "F4 / F8:     Save deciphered text to file.");
  mvprintw(12, 0, "F10:         Quit");
  mvprintw(14, 0, "Press any key to continue...");
  refresh();

  flushinp();
  getch(); /* Wait for keypress. */
  flushinp();
}

static void display_frequency(void)
{
  int count[UCHAR_MAX + 1]; /* One slot for every possible byte value. */
  int i, y, x, maxy, maxx;
  unsigned char c;

  memset(count, 0, sizeof(count));

  for (i = 0; i < TEXT_MAX; i++) {
    if (text[i] == '\0')
      break;
    count[text[i]]++;
  }

  erase();
  getmaxyx(stdscr, maxy, maxx);
  y = x = 0;
  for (c = 0; c < UCHAR_MAX; c++) {
    if (! isupper(c))
      continue;

    mvprintw(y, x, "%c: %d", c, count[c]);
    x += 10;
    if (x > (maxx - 10)) {
      x = 0;
      y++;
    }
  }
  mvprintw(y + 1, 0, "Press any key to continue...");
  refresh();
  
  flushinp();
  getch(); /* Wait for keypress. */
  flushinp();
}

static void screen_init(void)
{
  initscr();
  atexit((void *)endwin);
  noecho();
  keypad(stdscr, TRUE);
}

static void screen_update(void)
{
  unsigned char c;
  int i, y, x, maxy, maxx, skip_newline;

  getmaxyx(stdscr, maxy, maxx);
  erase();

  /* Alphabet. */
  x = 0;
  for (c = 0; c < UCHAR_MAX; c++) {
    if (allowed_char[c] != -1) {
      mvaddch(0, x, c);
      x++;
      if (x > maxx)
        break;
    }
  }

  /* Cipher */
  x = 0;
  for (c = 0; c < UCHAR_MAX; c++) {
    mvaddch(1, x, cipher[c]);
    x++;
    if (x > maxx)
      break;
  }

  /* Upper Separation Line */
  mvhline(2, 0, ACS_HLINE, maxx);

  /* Text */
  skip_newline = text_offset;
  move(3, 0);
  for (i = 0; i < TEXT_MAX; i++) {
    if (text[i] == '\0')
      break;

    if (skip_newline > 0) {
      if (text[i] == '\n') {
        skip_newline--;
      }
      continue;
    }

    c = cipher_applied(text[i]);
    if (c != text[i]) {
      attron(A_REVERSE);
    }
    addch(c);
    if (c != text[i]) {
      attroff(A_REVERSE);
    }

    getyx(stdscr, y, x);
    if (y >= (maxy - 1)) {
      break;
    }
  }

  /* Lower Separation Line */
  mvhline(maxy - 1, 0, ACS_HLINE, maxx);

  move(1, cipher_pos);
  refresh();
}

static void screen_resize(void)
{
  endwin(); /* To get new window limits. */
  screen_update();
  flushinp();
  keypad(stdscr, TRUE);
  cipher_pos = 0;
}

int main(int argc, char *argv[])
{
  int c;

  if (argc != 2) {
    fprintf(stderr, "Usage: %s <filename>\n", argv[0]);
    return 0;
  }

  cipher_init();
  if (text_read(argv[1]) != 0) {
    return 1;
  }
  screen_init();

  while (1) {
    screen_update();
    c = getch();

    switch (c) {
    case KEY_RESIZE:
      screen_resize();
      break;

    case KEY_LEFT:
      cipher_pos--;
      if (cipher_pos < 0)
        cipher_pos = 0;
      break;

    case KEY_RIGHT:
      cipher_pos++;
      if (cipher_pos >= allowed_char_len)
        cipher_pos = allowed_char_len - 1; /* Last valid cipher slot. */
      break;

    case KEY_UP:
      text_offset--;
      if (text_offset < 0)
        text_offset = 0;
      break;

    case KEY_DOWN:
      text_offset++;
      /* NOTE: Nothing preventing infinite scrolling... */
      break;

    case KEY_PPAGE:
      text_offset -= PAGE_OFFSET_SKIP;
      if (text_offset < 0)
        text_offset = 0;
      break;
      
    case KEY_NPAGE:
      text_offset += PAGE_OFFSET_SKIP;
      /* NOTE: Nothing preventing infinite scrolling... */
      break;

    case KEY_F(1):
    case KEY_F(5):
      display_help();
      break;

    case KEY_F(2):
    case KEY_F(6):
      display_frequency();
      break;

    case KEY_F(3):
    case KEY_F(7):
      cipher_erase();
      break;

    case KEY_F(4):
    case KEY_F(8):
      text_save(argv[1]);
      break;

    case KEY_F(10):
      exit(0);

    case ' ':
      cipher[cipher_pos] = ' ';
      break;

    default:
      if (c >= 0 && c <= UCHAR_MAX && isalpha(c)) { /* Skip curses key codes. */
        cipher[cipher_pos] = toupper(c);
      }
      break;
    }
  }

  return 0;
}
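
The two core operations of the program, applying a partial key and counting letter frequencies, can be sketched in a few lines of Python 3. This is a simplified illustration and not a port: here the key is a dictionary from ciphertext letter to guessed letter, which is an assumption of this example, whereas the C program stores the key in an array indexed by alphabet position.

```python
from collections import Counter

def apply_key(text, key):
    """Replace every letter that has a guess in key, leave the rest as-is."""
    return "".join(key.get(ch, ch) for ch in text)

def letter_frequency(text):
    """Count how often each uppercase letter occurs in the text."""
    return Counter(ch for ch in text if ch.isupper())
```

For instance, apply_key("XYZ X", {"X": "A"}) returns "AYZ A", and letter_frequency(text).most_common(5) lists the five most frequent letters, which is usually the starting point for guessing the key.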
          


Topic: Scripts and Code, by Kjetil @ 05/07-2014, Article Link

Indexed String Search

Just for experimentation, I created an indexed string search system based on hashing, implemented in Python for fast prototyping. The results were rather shocking actually. I had a hunch that it could be slow, but not this slow. The C-based string search I made earlier, that searches file by file directly, is a lot faster.

The principle for this script is to create a hashed database that holds all the words with their locations. In theory, this is quicker than searching through all the files again and again. I think the main problem is the gigantic database that gets created, which is 10 times larger than the actual files to search. This takes a very long time to load into memory every time the script is called.

Anyway, here is the code. It works, but it's very slow:

#!/usr/bin/python

import os.path
import pickle

class SearchDatabase(object):
    def __init__(self):
        self._db = dict()

    def _visitor(self, arg, dirname, names):
        for name in names:
            filename = os.path.join(dirname, name)
            if os.path.isfile(filename):
                fh = open(filename, "r")
                for line_no, line in enumerate(fh):
                    location = "%s:%d" % (filename, line_no + 1)
                    for word in line.split():
                        if not word.upper() in self._db:
                            self._db[word.upper()] = dict()
                        self._db[word.upper()][location] = line.rstrip()
                fh.close()

    def create(self, directory):
        os.path.walk(directory, self._visitor, None)

    def save(self, filename):
        fh = open(filename, "wb")
        pickle.dump(self._db, fh, pickle.HIGHEST_PROTOCOL)
        fh.close()

    def load(self, filename):
        fh = open(filename, "rb")
        self._db = pickle.load(fh)
        fh.close()

    def locate_and_display(self, word):
        try:
            for location in self._db[word.upper()]:
                print "%s:%s" % (location, self._db[word.upper()][location])
        except KeyError:
            return

if __name__ == "__main__":
    import sys
    import getopt

    def usage_and_exit():
        print "Usage: %s <options> <word>" % (sys.argv[0])
        print "Options:"
        print "  -h        Display this help and exit."
        print "  -d DIR    Create database by traversing directory DIR."
        print "  -f FILE   Use FILE as database filename."
        print ""
        sys.exit(1)

    try:
        opts, args = getopt.getopt(sys.argv[1:], "hd:f:")
    except getopt.GetoptError as err:
        print "Error:", str(err)
        usage_and_exit()

    directory = None
    db_filename = "locate.db"

    for o, a in opts:
        if o == '-h':
            usage_and_exit()
        elif o == '-d':
            directory = a
        elif o == '-f':
            db_filename = a

    sdb = SearchDatabase()

    if directory: # Create Mode
        if not os.path.isdir(directory):
            print "Error: invalid directory"
            usage_and_exit()
        sdb.create(directory)
        sdb.save(db_filename)

    else: # Search Mode
        if len(args) == 0:
            print "Error: please specify a search word"
            usage_and_exit()
        sdb.load(db_filename)
        for arg in args:
            sdb.locate_and_display(arg)
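
The indexing principle itself can be shown with a small in-memory Python 3 sketch. Unlike the script above, this variant stores only (name, line number) pairs and not a copy of every line per word; whether that would shrink the database enough to matter is an untested assumption. The function names and the dictionary-of-texts input are made up for this example.

```python
from collections import defaultdict

def build_index(documents):
    """Map each uppercased word to the set of (name, line_no) where it occurs."""
    index = defaultdict(set)
    for name, text in documents.items():
        for line_no, line in enumerate(text.splitlines(), start=1):
            for word in line.split():
                index[word.upper()].add((name, line_no))
    return index

def locate(index, word):
    """Return a sorted list of locations for word, or an empty list."""
    return sorted(index.get(word.upper(), ()))
```

A lookup is then a single dictionary access, for example locate(build_index({"a.txt": "hello world"}), "HELLO") returns [("a.txt", 1)].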
          


Topic: Scripts and Code, by Kjetil @ 01/06-2014, Article Link

Recursive String Search Improved

Remember Recursive String Search?
I have made an improved version based on some experience with that program. First of all, this new one will search in the current directory by default, unless the '-d' option is used. Second, there is a simple filter that can be applied for filename extensions, which is helpful for code search. Finally, it's possible to print the filename on each line found. This is very convenient if the search results are to be "grepped" further.

Once again, there is a binary for Win32 (compiled with MinGW) and here's the source code:

#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <ctype.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <limits.h>
#include <unistd.h>
#include <dirent.h>

#define FILTER_DELIMITER ";"
#define FILTER_MAX 10

static char *filters[FILTER_MAX];
static int no_of_filters = 0;

static void display_help(char *progname)
{
  fprintf(stderr, "Usage: %s <options> <string>\n", progname);
  fprintf(stderr, "Options:\n"
     "  -h          Display this help and exit.\n"
     "  -n          Print filename on each search match line.\n"
     "  -d DIR      Search DIR instead of current directory.\n"
     "  -f FILTER   Apply FILTER on filename extension when searching.\n"
     "                Delimited by ';', e.g. '.c;.cpp;.h;.hpp'\n"
     "\n");
}

static void build_filter(char *filter)
{
  no_of_filters = 0;

  filters[0] = strtok(filter, FILTER_DELIMITER);
  if (filters[0] == NULL)
    return;

  for (no_of_filters = 1; no_of_filters < FILTER_MAX; no_of_filters++) {
    filters[no_of_filters] = strtok(NULL, FILTER_DELIMITER);
    if (filters[no_of_filters] == NULL)
      break;
  }
}

static int matches_filter(char *name, int name_len)
{
  int i, n1, n2, match, filter_len;

  if (no_of_filters == 0)
    return 1; /* No filters, always matches. */

  for (i = 0; i < no_of_filters; i++) {
    filter_len = strlen(filters[i]);
    if (filter_len > name_len)
      continue; /* This filter is longer than the name, try the next one. */

    match = 0;
    n2 = name_len - 1;
    for (n1 = filter_len - 1; n1 >= 0; n1--, n2--) {
      if (toupper(filters[i][n1]) != toupper(name[n2]))
        break;
      match++;
    }

    if (filter_len == match)
      return 1; /* Whole filter matched. */
  }

  return 0; /* No matches. */
}

static void recurse(char *path, char *string, int string_len, int show_mode)
{
  static char line[1024]; /* Allocate in BSS. */
  char full_path[PATH_MAX];
  int n, match, name_shown, line_no;
  struct dirent *entry;
  struct stat st;
  DIR *dh;
  FILE *fh;

  dh = opendir(path);
  if (dh == NULL) {
    fprintf(stderr, "Warning: Unable to open directory: %s\n", path);
    return;
  }

  while ((entry = readdir(dh))) {
    if (entry->d_name[0] == '.')
      continue; /* Ignore files with leading dot. */

#ifdef WINNT
    snprintf(full_path, PATH_MAX, "%s\\%s", path, entry->d_name);
#else
    snprintf(full_path, PATH_MAX, "%s/%s", path, entry->d_name);
#endif

    if (stat(full_path, &st) != 0)
      continue; /* Skip entries that cannot be examined. */
    if (S_ISDIR(st.st_mode)) {
      /* Traverse. */
      recurse(full_path, string, string_len, show_mode);

    } else if (S_ISREG(st.st_mode)) {
      /* Search. */
      if (! matches_filter(full_path, strlen(full_path)))
        continue;

      fh = fopen(full_path, "r");
      if (fh == NULL) {
        fprintf(stderr, "Warning: Unable to open file: %s\n", full_path);
        continue;
      }

      name_shown = line_no = 0;
      while (fgets(line, 1024, fh) != NULL) {
        line_no++;
        match = 0;
        for (n = 0; line[n] != '\0'; n++) {
          if (toupper(line[n]) == toupper(string[match])) {
            match++;
            if (match >= string_len) {
              if (show_mode == 0) {
                if (! name_shown) {
                  printf("%s\n", full_path);
                  name_shown = 1;
                }
                printf("%d:%s", line_no, line);
              } else {
                printf("%s:%d:%s", full_path, line_no, line);
              }
              break;
            }
          } else {
            n -= match; /* Restart just after where this partial match began. */
            match = 0;
          }
        }
      }
      fclose(fh);
    }
  }

  closedir(dh);
  return;
}

int main(int argc, char *argv[])
{
  int c;
  int show_mode = 0;
  char *search_dir = NULL;

  while ((c = getopt(argc, argv, "hnd:f:")) != -1) {
    switch (c) {
    case 'h':
      display_help(argv[0]);
      exit(EXIT_SUCCESS);

    case 'n':
      show_mode = 1;
      break;

    case 'd':
      search_dir = optarg;
      break;

    case 'f':
      build_filter(optarg);
      break;
    
    case '?':
    default:
      display_help(argv[0]);
      exit(EXIT_FAILURE);
    }
  }

  if (search_dir == NULL) {
    search_dir = "."; /* Current directory. */
  }

  if (optind >= argc) {
    display_help(argv[0]);
    return EXIT_FAILURE;
  }

  recurse(search_dir, argv[optind], strlen(argv[optind]), show_mode);
  return EXIT_SUCCESS;
}
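
The extension filter boils down to a case-insensitive suffix test against each entry in the ';' delimited list. Here is a minimal Python 3 sketch of that rule, assuming a filter string in the same format as the '-f' option; the function is illustrative and not a translation of matches_filter() above.

```python
def matches_filter(name, filter_spec):
    """Return True if name ends with any of the ';' separated extensions."""
    filters = [f for f in filter_spec.split(";") if f]
    if not filters:
        return True  # No filters, always matches.
    lowered = name.lower()
    return any(lowered.endswith(f.lower()) for f in filters)
```

With the filter ".c;.cpp;.h;.hpp", a name like "src/Main.CPP" matches while "notes.txt" does not.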
          


Topic: Scripts and Code, by Kjetil @ 05/05-2014, Article Link

True Filename Listing

Here is a special tool I made to view the true names of files. A filename with characters other than ASCII (or ISO-8859) just shows up as question marks if you use the regular "ls" command. This is an alternative command that displays the filenames using a hex encoding, if the output is a TTY (terminal). If the output is redirected to a file, the true character values will be dumped.

Here is the source code, enjoy:

#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <dirent.h>
#include <unistd.h>
#include <ctype.h>
#include <string.h>

static void list(char *dirname)
{
  DIR *dir;
  struct dirent *file;
  int i, len, longest;

  dir = opendir(dirname);
  if (dir == NULL)
    return;

  longest = 0;
  while ((file = readdir(dir)) != NULL) {
    len = strlen(file->d_name);
    if (len > longest)
      longest = len;
  }

  rewinddir(dir);

  while ((file = readdir(dir)) != NULL) {
    if (isatty(STDOUT_FILENO)) {
      /* Printable name */
      for (i = 0; file->d_name[i] != '\0'; i++) {
        if (isprint((unsigned char)file->d_name[i])) {
          printf("%c", file->d_name[i]);
        } else {
          printf("?");
        }
      }

      /* Padding */
      while (i < longest) {
        printf(" ");
        i++;
      }
      printf("   ");

      /* Hex name */
      for (i = 0; file->d_name[i] != '\0'; i++) {
        printf("%02x", (unsigned char)file->d_name[i]);
      }
      printf("\n");

    } else {
      /* Raw output */
      for (i = 0; file->d_name[i] != '\0'; i++) {
        printf("%c", file->d_name[i]);
      }
      printf("\n");

    }
  }

  closedir(dir);
}

int main(int argc, char *argv[])
{
  int i;
  struct stat st;

  if (argc > 1) {
    for (i = 1; i < argc; i++) {
      if (stat(argv[i], &st) == 0) {
        if (S_ISDIR(st.st_mode)) {
          if (i > 1)
            printf("\n");
          printf("%s:\n", argv[i]);
          list(argv[i]);
        }
      }
    }
  } else {
    list(".");
  }

  return 0;
}
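
The two columns printed in terminal mode can be illustrated with a short Python 3 sketch working on the raw bytes of a filename. One assumption to note: this version treats only the ASCII range 0x20 to 0x7E as printable, while the C program relies on isprint(), whose result can depend on the locale. The function name describe_name is made up for this example.

```python
def describe_name(raw_name):
    """Return (printable form, hex form) for a filename given as bytes."""
    printable = "".join(chr(b) if 0x20 <= b < 0x7F else "?" for b in raw_name)
    return printable, raw_name.hex()
```

Raw names can be obtained with os.listdir(b"."), which returns bytes when given a bytes path. A Latin-1 encoded name such as b"caf\xe9" comes out as ("caf?", "636166e9").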
          


Topic: Scripts and Code, by Kjetil @ 02/04-2014, Article Link

Cue Sheet Splitter

Here is a Perl script that reads a cue sheet file and attempts to use that to split an associated WAV file.
It relies on both the "sox" (Sound eXchange, the Swiss Army knife of audio manipulation) and "flac" (Free Lossless Audio Codec) external tools.

I have used it successfully on several sets, take a look:

#!/usr/bin/perl -w
use strict;
# FLAC to WAV: flac -d *.flac

my $src = shift @ARGV or die "Usage: $0 <big wav file> [cue sheet]\n";

undef $/;

my $last_track = 0;
my $last_title = 0;
my $last_artist = 0;
my $last_ts_min = 0;
my $last_ts_sec = 0;
my $last_ts_msec = 0;

while (<>) {
  while (m/TRACK (\d+) AUDIO.*?TITLE "([^"]*)".*?PERFORMER "([^"]*)".*?INDEX \d+ (\d+):(\d+):(\d+)/gs) {
    my $track = $1;
    my $title = $2;
    my $artist = $3;
    my $ts_min = $4;
    my $ts_sec = $5;
    my $ts_msec = $6;

    # Cue sheet INDEX timestamps are MM:SS:FF, where FF counts frames of 1/75
    # second (not hundredths), so convert everything to seconds for sox.
    my $last_start = ($last_ts_min * 60) + $last_ts_sec + ($last_ts_msec / 75);
    my $start = ($ts_min * 60) + $ts_sec + ($ts_msec / 75);
    my $duration = $start - $last_start;

    if ($last_track > 0) {
      system("sox $src $last_track.wav trim $last_start $duration");
      system("flac --best $last_track.wav");
      system("mv $last_track.flac \"$last_track - $last_artist - $last_title.flac\"");
    }

    $last_track = $track;
    $last_title = $title;
    $last_artist = $artist;
    $last_ts_min = $ts_min;
    $last_ts_sec = $ts_sec;
    $last_ts_msec = $ts_msec;
  }
}

# The last track runs from its start position to the end of the file.
my $final_start = ($last_ts_min * 60) + $last_ts_sec + ($last_ts_msec / 75);
system("sox $src $last_track.wav trim $final_start");
system("flac --best $last_track.wav");
system("mv $last_track.flac \"$last_track - $last_artist - $last_title.flac\"");
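
Cue sheet INDEX timestamps are written as MM:SS:FF, where FF counts frames of 1/75 second. As a small worked example of that conversion, here is a Python 3 sketch that turns such a timestamp into seconds; the function name cue_to_seconds is made up for this illustration.

```python
def cue_to_seconds(timestamp):
    """Convert a cue sheet MM:SS:FF timestamp to seconds (75 frames per second)."""
    minutes, seconds, frames = (int(part) for part in timestamp.split(":"))
    return minutes * 60 + seconds + frames / 75.0
```

A track starting at 03:12:37 therefore begins roughly 192.49 seconds into the file, and the duration of a track is the difference between two consecutive start positions.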
          


Topic: Scripts and Code, by Kjetil @ 01/03-2014, Article Link

HTML Calendar Generator

There are probably a ton of these already, but here is my contribution, based on my personal preferences. I prefer to have week numbers on my calendars, and to have the week start on Mondays. The output is in a 4x3 month format, suitable for printing on an A4-sized sheet of paper in landscape orientation.

Here is the Python script:

#!/usr/bin/python

import calendar
import datetime
import sys

if len(sys.argv) > 1:
    try:
        year = int(sys.argv[1])
    except ValueError:
        sys.exit(1)
else:
    year = datetime.date.today().year

print '''<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
  <title>%s</title>
  <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
  <style type="text/css">
    table.year { 
      background-color: white;
    }
    table.month {
      background-color: gray;
      border: 5px solid white
    }
    td.year-name {
      background-color: white;
      text-align: center;
      font-weight: bold;
    }
    td.month-name {
      background-color: white;
      text-align: center;
      font-weight: bold;
      height: 20px;
    }
    td.wday {
      background-color: lightgray;
      color: black;
      text-align: center;
      width: 30px;
      height: 20px;
    }
    td.day {
      background-color: white;
      color: black;
      text-align: center;
      width: 30px;
      height: 20px;
    }
    td.week {
      background-color: lightgray;
      color: black;
      text-align: center;
      font-weight: bold;
      width: 25px;
      height: 20px;
    }
    h1 {
      text-align: center;
    }
  </style>
</head>
<body>
<table class="year">
<tr><td class="year-name" colspan="4"><h1>%s</h1></td></tr>
<tr>
<td>''' % (year, year)

for month in range(1, 13):
    skip_days, last_day = calendar.monthrange(year, month)
    day = 1

    print '<table class="month">'
    print '<tr><td class="month-name" colspan="8">%s</td></tr>' % (calendar.month_name[month])

    print '<tr>'
    print '<td></td>'
    for wday in range(0, 7):
        print '<td class="wday">%s</td>' % (calendar.day_abbr[wday])
    print '</tr>'

    for row in range(6):
        print '<tr>'
        try:
            week = datetime.date(year, month, day).isocalendar()[1]
            print '<td class="week">%d</td>' % (week)
        except ValueError:
            print '<td class="week"></td>'

        for col in range(7):
            if skip_days:
                print '<td class="day"></td>'
                skip_days -= 1
            else:
                if day > last_day:
                    print '<td class="day"></td>'
                else:
                    print '<td class="day">%d</td>' % (day)
                    day += 1

        print '</tr>'

    print '</table>'
    if month % 4 == 0:
        print '</td>'
        print '</tr>'
        if month != 12:
            print '<tr>'
            print '<td>'
    else:
        print '</td>'
        print '<td>'

print '</table>'
print '</body>'
print '</html>'
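
The layout logic, Monday-first weeks with an ISO week number in front of each row, can also be expressed with the standard calendar module. The following Python 3 sketch only computes the rows and leaves out the HTML; it is an alternative illustration and not how the script above works, and the function name month_rows is an assumption.

```python
import calendar
import datetime

def month_rows(year, month):
    """Return a list of (iso_week, days) rows; days holds 0 for empty cells."""
    rows = []
    cal = calendar.Calendar(firstweekday=calendar.MONDAY)
    for week in cal.monthdayscalendar(year, month):
        # Every week row contains at least one real day of the month.
        first_day = next(day for day in week if day != 0)
        week_no = datetime.date(year, month, first_day).isocalendar()[1]
        rows.append((week_no, week))
    return rows
```

For January 2014 the first row is (1, [0, 0, 1, 2, 3, 4, 5]), since the month started on a Wednesday in ISO week 1.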
          


Here is a calendar generated for 2014.

Topic: Scripts and Code, by Kjetil @ 01/02-2014, Article Link

Static PDCurses with MinGW

Statically linking a curses program on Windows is not as easy as it sounds. The main problem is that the PDCurses distributions that you'll find on sourceforge.net only contain the dynamic (DLL) library, not the static library. To get the static library, the only option seems to be to build PDCurses yourself from source.

I will try to explain. Before starting, make sure that you have installed MinGW and remembered to include the "make" program as well. Then proceed to download the PDCurses source and unpack it. Open a Windows command window and build it like this:

PATH=C:\MinGW\bin;C:\MinGW\msys\1.0\bin
cd PDCurses-3.4\win32\
make -f mingwin32.mak
          

(I had to set the PATH manually, or else I kept getting problems with a missing reference to libgmp-10.dll.)

The quick and dirty way to now link your curses program is to just copy the resulting pdcurses.a file (from the win32 directory) and the curses.h file (from the root PDCurses directory) to the directory containing your source and compile like this:

gcc -o program program.c -I. pdcurses.a
          


You can verify that the program is indeed statically linked by using objdump like this:

objdump -p program.exe
          

There should NOT be any reference to a PDCurses.dll file. If there is, well, then the program is still dynamically linked.

Sure, a statically linked program is much larger, but it's much easier to distribute, since I doubt many people have the PDCurses.dll file on their Windows computers.

Topic: Configuration, by Kjetil @ 01/01-2014, Article Link