Saturday, November 12, 2016

Finding a Path 2016.11.12

I haven't blogged here in a long time. Here's some info about a project I've been working on.

Recently I've been toying with Google's TensorFlow Python library. I've written an op for the CPU and the GPU. The op, of course, is a modified version of Dijkstra's algorithm. The funny thing is that the CPU version runs faster than the GPU version. There seems to be a long overhead, something like 7 seconds, before a GPU op begins to work. There is also a size restriction. My op works on grids. With the CPU version I can use a large grid; I've tried 480x480. On the GPU the op fails if the grid is larger than 70x70. What do I come away with? Well, I enjoyed writing the GPU version, but the CPU version ends up being superior.

So, the code is here:

I also use pygame to construct my GUI for testing with real png images.
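Since the op is essentially Dijkstra on a grid, the core idea can be sketched in a few lines of pure Python (a hypothetical illustration, not the actual op code; treating each grid value as the cost of stepping onto that cell is an assumption for the sketch):

```python
import heapq

def grid_dijkstra(cost, start, goal):
    """Cheapest path on a 2D grid with 4-connected moves.

    cost[r][c] is the price of stepping onto cell (r, c).
    Returns the total cost of the cheapest start->goal path,
    or infinity if the goal is unreachable.
    """
    rows, cols = len(cost), len(cost[0])
    dist = {start: 0}
    heap = [(0, start)]
    while heap:
        d, (r, c) = heapq.heappop(heap)
        if (r, c) == goal:
            return d
        if d > dist.get((r, c), float('inf')):
            continue  # stale queue entry
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                nd = d + cost[nr][nc]
                if nd < dist.get((nr, nc), float('inf')):
                    dist[(nr, nc)] = nd
                    heapq.heappush(heap, (nd, (nr, nc)))
    return float('inf')
```

A pure-Python loop like this is exactly the kind of thing that favors the CPU: the per-cell work is tiny and sequential, so a GPU's launch overhead is hard to amortize.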

Thursday, July 28, 2016

awesome-audio-cnn 'brain' file

In the last post I mentioned that I was making public a github project that I've been working on. The project is called 'awesome-audio-cnn' and it uses a neural network to select music for automatic playback on a simple server. The project '' file suggests that you train your own neural network file for the server. On reflection I decided this was too difficult to ask of users.

I have included in the git repository (mentioned in my last post) a file for operating the neural network that is at the heart of the awesome-audio-cnn project.

The file is 23.9M in size, so it will take a while to download. What this means is that you don't need to train the neural network yourself; it gives you a faster start-up. I should note that the file was developed at my location, on my audio collection, so it may not behave as you expect on your files. There is no provision, incidentally, for comparisons on the basis of tempo. The songs this neural net selects will not necessarily match with respect to tempo.

The inclusion of this file may mean that some users with a slow internet connection or limited disk space cannot use this project.

That said, someone with time on their hands might enjoy setting up this kind of server. The info in the file is provided on an 'as is' basis, without warranties or conditions of any kind.

The file is called 'fp-test.bin' and it is located in a folder called 'acnn-brain', for lack of a better title.

Instructions for awesome-audio-cnn

I recently made public a github repository I was working on called 'awesome-audio-cnn'. The code for the project is on github at . Below are the instructions that go with the project.


This is a complex project. There are several steps required for a full implementation, and some of the steps require resources not available to everyone. The ultimate goal is a music server that employs a neural network to help it select which songs to play. It might be possible to implement only part of the project and end up with a server that plays music without the neural network, playing selected albums in their entirety; this latter option is not explored in this document. Finally, this project presupposes that the user is serving up their own music in their own house. The project is not for any sort of distribution of music, and is scaled to operate with a small library of personal selections.
  1. Arrange your music in the predetermined format. Music for this project must be in the mp3 format and must be arranged in the following manner. All songs are sorted by album and are stored in folders with the artist and album title as the folder's name. These album folders are all stored in a single larger folder, usually called 'Music' or 'music'. There is generally a jpg image in each folder for the album's cover, called 'cover.jpg'. Song files that are simply dropped in the 'Music' folder without an album directory will not be recognized. Furthermore, all mp3 songs should be tagged using the id3 tag system. Good tools for working with mp3 files are Picard, Sound Juicer, and Sound Converter.
  2. Download the sources for the server project. The project is currently hosted on github and is called awesome-audio-cnn. Make a folder in your work area (or 'workspace' in eclipse terminology) for your neural network training data. You will use this folder later. For example, mine is at ~/workspace/ACNN/.
  3. Install the necessary supporting software. This may include, but may not be limited to, java8, libchromaprint-tools, tomcat8, activemq, and maven. For an IDE the developer used the IntelliJ IDEA Community Edition. The IDE is used for syntax verification; the mvn command line tools are what actually build the project.
  4. Build the server and the desktop versions of the project. You can use the command ./ /path/to/Music to set up the xml files in the project repository. Alternatively, you can mount the music directory at the location /mnt/acnn/Music/ and leave the shell script unused; this mounting can be accomplished by modifying your fstab file. Use the command ./ to see if your build environment is up to date. If the second script works you will have two files in your awesome-audio-cnn folder: a war file called audio.war and a jar file called acnn-desktop.jar.
  5. Use the desktop version to set up the working environment for training the neural network. You should start the desktop version with the command java -jar acnn-desktop.jar -train. The various pieces of information are stored in the user's home folder in a folder named .acnn. You need several thousand songs for this training to go well. Identify the location of the music folder to the desktop version of the software. You must also identify the folder that will hold the training data that you work with. On my computer this folder is called ~/workspace/ACNN/. It could be called anything and could be located anywhere in the user's home area. If at any time you want to start over with the training, one of the things you should do is erase the contents of the folder ~/.acnn.
  6. Use the desktop version to create the training 'csv' file. Originally the file is called 'myfile_1.csv' but you can change the name by clicking buttons on the desktop user interface. After you have set up all the directories and the filenames you should click the Make List button. This makes the csv list of your songs that is used in neural network training.
  7. Begin training your neural network. Training is an iterative process and requires the /usr/bin/fpcalc program installed with libchromaprint-tools. You press the buttons on the interface in a certain order and you watch the terminal that you started the interface in. Basically you press the Train button and watch the screen. When a certain amount of time has passed you press the Clear-Break button. Wait for the neural network model software to save the model. Then press the Test button to evaluate your progress. Wait while the testing software goes through the test set. At the end of the test phase you end up with a score. The score starts out at something like 0.5 but will improve with extensive periods of training. Repeat this process (#7) until you get a testing score between '0.85' and '0.95'. I stopped at a '0.94' score. Two files are created in this process. They are named (originally) fp-test.bin and fp-test.updater.bin. The base of the name ('fp-test') can be changed to anything you like during training, but the files must be named 'fp-test' when the web site is launched.
  8. Prepare to deploy the war file. This is one of the areas where the requirements of the project are very specific. The audio.war file is meant to reside on a tomcat8 server that is connected to a wifi router by Ethernet cable and reachable at a dedicated IP address. This way anyone in the area of the router can access the server via the IP address and play the music stored there. The server is meant for private use. The war file expects to find your two neural network files in the folder /opt/acnn/. Here it will set up another folder, /opt/acnn/.acnn/, which is in most ways identical to the one that the desktop jar file creates in your home folder. The /opt/acnn/ folder should contain the two neural network '.bin' files and also the file myfile.id3tag.csv. If this csv file is not present the program will try to create it. Maintaining this file is covered in the next step.
  9. Whenever you change the contents of your music file, and when you first setup your server, you must supply the audio.war file with a new copy of the myfile.id3tag.csv file. Start the desktop version of the program with the command java -jar acnn-desktop.jar -id3. After a few minutes the program will exit, leaving a new copy of the file in the training data folder. On my computer this would be in the ~/workspace/ACNN/ folder. Copy this file to the server computer and put it (with the permissions that will allow it to be read universally) in the /opt/acnn/ folder. There should be one entry in this csv file for every mp3 file that the server has access to.
  10. Go through the document and make sure that you have edited all the tomcat and activemq configuration files to allow the server to do its job. For activemq type the following.
    $ cd /etc/activemq/instances-enabled
    $ sudo ln -s ../instances-available/main .
    For the tomcat8 server you must specify memory sizes for startup of the catalina engine. Add the following line to the beginning of the file at the location /usr/share/tomcat8/bin/.
    export CATALINA_OPTS="$CATALINA_OPTS -Xms1024m -Xmx4g"
    For the tomcat8 admin interface, change the following to allow larger files to be processed. The web manager upload size must be changed. /usr/share/tomcat8-admin/manager/WEB-INF/web.xml needs to have the following code:
    <!-- 52MB max -->
    I added a zero to both 'max' numbers mentioned above.
  11. Set up the mysql database. To do this you should have root privileges. Type mysql -u root -p. When prompted enter the root password. Then type this:
    mysql> create database acnn;
    mysql> grant all on acnn.* to 'testuser'@'localhost' identified by 'fpstuff';
  12. Stop the tomcat8 webserver on the server computer. This can be achieved by typing sudo /etc/init.d/tomcat8 stop. Copy the audio.war file to the directory /var/lib/tomcat8/webapps/. Restart the server with the command sudo /etc/init.d/tomcat8 restart. Navigate to the server with your favorite browser on your favorite device and listen to your music.
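For step 10, the section of the manager web.xml being edited looks roughly like this on a stock tomcat8 install (verify against your own copy; the exact values may differ between packages). The two 'max' elements are the numbers that get a zero added:

```xml
<multipart-config>
  <!-- 52MB max -->
  <max-file-size>52428800</max-file-size>
  <max-request-size>52428800</max-request-size>
  <file-size-threshold>0</file-size-threshold>
</multipart-config>
```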
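Step 1's layout lends itself to a quick sanity check. Here is a small Python sketch (not part of the project, purely illustrative) that walks a 'Music' folder, lists each album folder's mp3 files, and flags stray mp3s dropped directly in 'Music', which the server would not recognize:

```python
import os

def scan_music(root):
    """Walk an album-per-folder music directory.

    Returns (albums, strays): albums maps each album folder name
    to its sorted list of mp3 files; strays lists mp3 files sitting
    directly in root, which the server ignores.
    """
    albums, strays = {}, []
    for entry in sorted(os.listdir(root)):
        path = os.path.join(root, entry)
        if os.path.isdir(path):
            songs = sorted(f for f in os.listdir(path)
                           if f.lower().endswith('.mp3'))
            albums[entry] = songs
        elif entry.lower().endswith('.mp3'):
            strays.append(entry)
    return albums, strays
```

Running something like this before step 9 is a cheap way to catch layout mistakes before they surface as missing entries in the csv file.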

Monday, May 2, 2016

Awesome Flyer

I returned to Awesome-Flyer to update the game. I added some stuff that connects the game to Google Leaderboards, and it seems to work well. I also corrected some screen size issues that were present if you played the game in portrait mode.

It all went reasonably well. The biggest part of the whole thing was switching from Eclipse, which I used when I started the game, to Android Studio, which is the current standard. This was all before I even started to address what was wrong with the game!

There are few, if any, real users of this game. Maybe I can change that in the future.

If you were a user of the game and you update to the new version, you will have to clear the game's local cache for the new version to work. Otherwise the game will crash at launch.

This is because I used 'integers' for saving the score in the old version of the game. The new version uses 'longs', and since I save the score in the system preferences (the application cache) and these values changed from 'int' to 'long', the stored preferences need to be cleared before the new values can be saved over the old.

My only fear now is that the 'jni' library for the game may not work for all SDK levels. I have tried it on SDK 22, 23, and some others. I do not have many devices to test with, though, and I don't have money to buy test devices. Hopefully the native library (the 'jni') just works. If not, then only a small number of people will be able to use the game.

Monday, April 4, 2016

Awesome Tag - 2016.04.04

Awesome Tag

This is a java application for detecting faces in pictures using convolutional neural networks. It should be noted that the app doesn't work. Here is a link to the github project: . However facial detection is really done, it is not done like this. This will probably be the first and last post about this project.


The source for the images was the NIST IJB-A dataset. The NIST pictures are part of a contest, but are free to download if you register with them. There are thousands of pictures, and several csv files with label data. In each picture faces are identified, and this data can be used to train your program.

The software libraries used in the program are from a company called Skymind. They are nd4j, deeplearning4j, and Canova. I used the 0.4-rc3.8 release of these distributions extensively. My CNN model was ultimately constructed with 4 layers. The first layer was a convolutional layer. The second was a pooling layer. The third was a 'DenseLayer' and the last was an output or 'softmax' layer. I was not entirely sure that the DenseLayer supported the kind of learning I was looking for, so I experimented with other layer configurations. In the end the DenseLayer performed best, so I kept it.

I used the linux operating system, java 8, and the IntelliJ IDE. The weights and biases for the neural network have to be re-learned every time the project is installed, because they are too big to save with the code. There are no weight and bias files online for this project.


You have to get a copy of the NIST IJB-A dataset. This is a 14G download, and you also need to unpack the tar.gz file, so you need something like 28G free. Then make sure IntelliJ IDEA is installed. Download the 'awesome-tag' java repository from and open the project using the IDE. You should launch the GUI launcher, the file called . You can also try launching the `` file; you may be able to launch the file this way without the IDE.

When opening the GUI for the first time you have to set the values for the program's operation. For example, you need to tell it where to find the IJB-A images and csv files. When this is done you can select a picture to view. The picture will probably be in the csv database, and the csv database is loaded automatically when the GUI is started, so if you want to you can click the 'Add Lines' button and see the boxes that identify the faces in the picture. This is the basic information that is provided by the IJB-A dataset.

At this stage you also identify what 'split' you want to start with and what 'split' you want to end with. Because there are so many images it is sufficient to choose 'split' 1 to start and also 'split' 1 to end. Setting these initial values creates a folder in the user's home directory called `.atag`. This folder holds all the info that the GUI uses in its regular operation.

The next step is to build your own csv file. This file contains the boxes mentioned above and also empty boxes without faces in them. These empty boxes are used by the program to train the CNN: for every box with a face there is another box of the same size with no face. You create this second csv file by clicking the 'Mod And Save' button. This process can take an hour or more just for one 'split'. If you have chosen different splits for start and finish the program will cycle through the set you have chosen and process all of them. There are ten splits in the IJB-A dataset, and trying to process all ten could take a long time.
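The pairing of face boxes with same-size empty boxes can be sketched like this (a hypothetical illustration, not the project's actual code; representing boxes as (x, y, w, h) tuples and the random placement rule are both assumptions):

```python
import random

def negative_boxes(face_boxes, img_w, img_h, seed=0):
    """For each face box, pick a same-size box elsewhere in the
    image that overlaps no face, to use as a 'no face' example."""
    rng = random.Random(seed)

    def overlaps(a, b):
        ax, ay, aw, ah = a
        bx, by, bw, bh = b
        return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah

    negatives = []
    for (_, _, w, h) in face_boxes:
        for _ in range(1000):  # rejection sampling: retry until clear
            cand = (rng.randrange(img_w - w), rng.randrange(img_h - h), w, h)
            if not any(overlaps(cand, f) for f in face_boxes):
                negatives.append(cand)
                break
    return negatives
```

Matching the negative box sizes to the positives keeps the two training classes comparable, so the network learns face-vs-background rather than big-vs-small.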

After setting up your csv file you can start training. Click the 'Reset Cursor' button to make sure that your input data cursor is starting with a zero value. By the way, you can also choose to erase all weights and biases when you click the 'Reset Cursor' button. Click the 'Train CNN' button once. This will start training. Training is very time consuming, so a mechanism is provided for saving your weights and biases, so that you can stop training and restart the process where you left off.

Loading and saving the neural network takes from five to ten minutes. When you click the 'Train CNN' button once the program will attempt to load the model. Then it will automatically start the training process. If you click the button again the neural network will stop training and will start the save process. You will see a dialog box on screen with a cycling progress bar until the save process has stopped. Again, this takes five to ten minutes. After it has stopped you can close the app or move on to testing or prediction. It takes several hours of training to get any results at all. The idea here is to let you do that training in smaller segments. You can force the program to stop during the save operation, but you are likely to corrupt your weight and bias files and thereby lose any learning that you have completed up to that point.

Clicking the 'Test CNN' button will allow you to test the accuracy of your CNN against your own csv file data. Clicking the 'Predict' button will attempt to find faces in the currently selected image. Both these operations take time. Though you may be able to train your network to the level of 75 or 80 percent, the process employed by the 'Predict' mechanism does not identify images properly. This is the reason that the project will not be developed further.


The current 'Prediction' scheme is a two-stage operation. First you click the "Prediction" button and choose the "EVEN" option from the two options presented. Then you wait for the first stage, the "EVEN" stage, to complete. When it's done you see the image on the screen with some superimposed boxes on it. This is stage 1. Then click the "Prediction" button again and choose the "IMPROVE" option. Again you must wait a long time. After this is done you see the final result on the screen.

The "EVEN" operation divides the picture into evenly spaced boxes and runs them through the neural network. When this is done you get separate "areas of interest". These areas of interest are the boxes you see superimposed on the screen after the first "Prediction" phase.
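The "EVEN" tiling can be pictured in a few lines of Python (an illustration of the idea, not the project's code; the square box size and stride are assumptions):

```python
def even_boxes(img_w, img_h, box, stride):
    """Tile an image with evenly spaced square boxes of side 'box',
    stepping by 'stride'; each box would then be scored by the CNN."""
    return [(x, y, box, box)
            for y in range(0, img_h - box + 1, stride)
            for x in range(0, img_w - box + 1, stride)]
```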

The "IMPROVE" operation takes the areas of interest and searches for a box size that most clearly looks like a face. It takes each area-of-interest box separately and, in that location, tries out eight boxes of different sizes to see which produces the best score.
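The "IMPROVE" step can be sketched in the same spirit (hypothetical code; the eight scale factors and the score function are assumptions, since the real program scores candidates with the CNN):

```python
def improve_box(box, score_fn,
                scales=(0.6, 0.7, 0.8, 0.9, 1.1, 1.25, 1.5, 1.75)):
    """Around one area-of-interest box, try eight resized boxes
    centered on the same point and keep the best-scoring one."""
    x, y, w, h = box
    cx, cy = x + w / 2, y + h / 2
    best, best_score = box, score_fn(box)
    for s in scales:
        nw, nh = int(w * s), int(h * s)
        cand = (int(cx - nw / 2), int(cy - nh / 2), nw, nh)
        sc = score_fn(cand)
        if sc > best_score:
            best, best_score = cand, sc
    return best
```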

Wednesday, February 17, 2016

awesome-cnn-android ABOUT

This text was taken from the 'ABOUT' file in the awesome-cnn-android github project. The complete project can be found here:

This program uses convolutional neural networks to try to figure out what letter you are drawing on the screen and feed it to an android program as input. The neural networks used are of the 'LeNet' type, pioneered by Yann LeCun.

The implementation of neural networks in this project uses Skymind Inc.'s 'deeplearning4j' package, which is licensed under the Apache 2 license. Most Skymind code can be found in the files '' and '' in the folder 'app/src/main/java/org/davidliebman/android/ime/'.

Tuesday, February 16, 2016

awesome-cnn-android INSTRUCTIONS

I have been working on a new project. It uses a convolutional neural network written in java. It runs on an android phone and allows you to draw letters and numbers and output them to the text area that you are using. I will post more about it shortly. Right now it is a repository on github; you cannot get it on the Play store.

See this link for the actual project:
  1. Load the app onto your device.
  2. Wait for the text view to change from "LOADING" to a blank area. There should also be a progress spinner. Wait until it disappears. When this has happened the network biases have loaded. This can take five minutes.
  3. Draw a character on the screen. Use the gray square at the bottom of the screen to draw your character.
  4. Press the button marked "ENTER". The CNN should try to figure out the letter you drew.
  5. You can select from the categories 'UPPER-case', 'LOWER-case', and 'NUM-bers' by pressing the toggle button above the gray character input area. In 'UPPER' mode only upper case characters will be correctly identified at the input. Similarly in 'LOWER' mode only lower case characters will be detected.
  6. You can also choose 'WRITE' or 'ERASE' from another toggle button above the gray input area. This allows you to edit the character you are drawing with more precision.
  7. Special buttons have been provided for 'Backspace' (labeled 'BACK') and 'Carriage-Return' (labeled 'GO'). There are also buttons for navigation that will allow you to move forward and back in the text area. These are labeled with arrows.
  8. If you cannot get the neural network to recognise your drawings, there is a set of drop-down menus that will allow you to pick any character that is displayed on the normal keyboard and send it to the text-area. There are four of these dropdowns, for (a) numbers, (b) symbols, (c) upper-case letters, and (d) lower-case letters. They are located just to the right and left of the input area at the bottom of the screen. To use them, touch the drop-down menu and select the character you want. When you do this that character will appear on the label of the 'ENTER' key. Press the enter key at this time to send the selected character to the text area.