Your task is to build a system -- a classifier -- that is able to predict the gender of a novel name based on this training data. You are free to use any technique you want. Presumably the features will be based on the characters, and some characters will likely be more relevant than others, but you should make no a priori assumptions about which features are relevant. Instead, I strongly advise you to divide the data I have given you into a training set and a held-out test set of your own, try out various ideas, and see which one works best on your held-out test data.
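As an illustration of the advice above, here is a minimal sketch of a held-out comparison. Everything in it is an assumption made for the example: the (name, label) pair representation, the "F"/"M" labels, the two candidate features (last letter, last two letters), and the majority-vote model. It is not a recommended design, only one way to set up the "try ideas and compare them on held-out data" loop.

```python
import random
from collections import Counter, defaultdict

def split_holdout(examples, test_fraction=0.2, seed=0):
    """Shuffle a copy of the examples and split off a held-out portion."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]  # (train, held-out)

# Two hypothetical candidate features; swap in your own ideas here.
def last_letter(name):
    return name[-1:].lower()

def last_two_letters(name):
    return name[-2:].lower()

def train(examples, feature):
    """Record the majority label for each feature value, plus an overall default."""
    counts = defaultdict(Counter)
    overall = Counter()
    for name, label in examples:
        counts[feature(name)][label] += 1
        overall[label] += 1
    table = {value: c.most_common(1)[0][0] for value, c in counts.items()}
    default = overall.most_common(1)[0][0]
    return table, default

def predict(model, feature, name):
    """Look up the feature value; fall back to the default for unseen values."""
    table, default = model
    return table.get(feature(name), default)

def accuracy(model, feature, examples):
    """Fraction of examples whose predicted label matches the true label."""
    correct = sum(predict(model, feature, n) == lab for n, lab in examples)
    return correct / len(examples)

def compare_features(examples, features, test_fraction=0.2, seed=0):
    """Train one model per candidate feature and score each on the same held-out split."""
    train_set, held_out = split_holdout(examples, test_fraction, seed)
    return {f.__name__: accuracy(train(train_set, f), f, held_out) for f in features}
```

Calling `compare_features(examples, [last_letter, last_two_letters])` on your own parsed data returns a held-out accuracy for each candidate. Because the split is fixed by `seed`, every candidate is scored on the same held-out names, which keeps the comparison fair.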
On April 30 at 7:00 PM, I will release the test data. You will run your classifier on that data.
You will turn in a tar file in the usual form, containing two files:
data.tst    your predictions
readme      a file that explains how your classifier works
Note that I want data.tst to be exactly in the same format as the training data, with the lines in the same order as I gave them to you. If you are not sure what "exactly in the same format" means, then you should ask.