Projects & Papers

Papers and Essays

  • CourtListener.com: A platform for researching and staying abreast of the latest in the law, Michael Lissner, 07 May 2010. [pdf]
  • Exploratory Analysis of Service Recipients of the Community Services Bureau, Michael Lissner, 27 February 2010. [html]
  • Breaking ReCAPTCHA, Michael Lissner, 9 December 2009. [pdf]
  • Proactive Methods for Secure Design, Michael Lissner, 9 December 2009. [pdf]
  • Facebook's Battle Sign, Michael Lissner, 16 November 2009. [pdf]
  • Wikipedia Article on Jacobsen v. Katzer, Michael Lissner, et al, 03 October 2009. [html]
  • The Difficulties of Managing Online Estates, Michael Lissner, 15 May 2009. [pdf]
  • Online Grieving by Default, Michael Lissner, 12 May 2009. [pdf]
  • The Layered FTC Approach to Online Behavioral Advertising, Michael Lissner, 02 April 2009. [pdf]
  • Technology Revolution and the Fourth Amendment, Michael Lissner, 22 May 2009. [pdf]
  • Wikipedia Article on Zeran v. AOL, Michael Lissner, et al, 18 March 2009. [html]
  • Sustainability Metrics for the Energy Sector, Michael Lissner, Hazel Onsrud, Sharmila Ravula, 10 December 2008. [pdf]
  • TuneRepublic Democratic Jukebox, Ryan Greenberg, Michael Lissner, Zain Syed, 07 January 2009. [pdf]

Programming Projects

  • Swimlane Diagram Generator, 06 September 2010, XSLT. [html]
  • Mercurial Hook to Automatically Copyright Pushed Files, 24 January 2010, Python. [html]
  • F-spot Photo Management Database Cleaner, 14 October 2009, bash, SQL. [html]
  • Yelp Scraper, 21 December 2008, Python. [html]
  • Twitter Credentials Verification Script, 03 April 2009, Python. [html]
  • Pacific Crest Trail Temperature Analysis Visualization, 12 December 2007, Java. [html]

Presentations

  • Interface Design Final Project, 07 May 2009. [ppt]
  • HTML Basics, 09 July 2010. [html]
  • Search, 28 July 2010. [html]
  • Browsers, 30 July 2010. [html]
  • Mechanize, Beautiful Soup and Regular Expressions, 11 April 2009. [pdf]

Websites

Final Project from Interface Aesthetics Class

I had my last day of actual class today, and the semester is really beginning to wind down. I still have at least 60 pages of writing to do in the next two weeks, but somehow it will get done. One class that I have finished my final project work for is Interface Aesthetics, which was a survey class covering typography, color, layout, web design, and a handful of other topics.

For the final project, we each made presentations of our work from the semester (mine's linked below), and on Monday from 4:10 to 6pm we will be holding an open house to share our work at the iSchool. A lot of it is really quite good, though this presentation is what I will be showing, so this is one cat that's out of the bag.

Some of the work in the attached presentation could still use some refinement, but I will point you towards the pieces titled "Type I," "Type II," and "Icons," which I think came out pretty well.

Location Based DNS Switching For Internet vs. Intranet

I realized over the weekend that since I run my own mail server out of my home, I can configure my computer to download my mail over the intranet whenever I am on my home network. By doing this, I can drastically reduce my mail download times because it cuts the Internet out of the equation. Rather than using DNS and the Internet to get my mail, I can download it directly from the internal IP address of the server.

To understand how to set this up, you have to understand that whenever you use a domain name (like michaeljaylissner.com), your computer looks up its IP address. First, it looks in /etc/hosts to see if the domain is listed locally. If it is, it uses the IP listed there. If it is not, it asks a DNS server (usually your Internet provider's) and uses the answer. Thus, what we want is for /etc/hosts to provide the internal IP of our server when we are at home, and to leave the server out when we are away.
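To see the first half of that lookup in isolation, here is a minimal sketch of a helper that reads a hosts-format file and prints the IP it lists for a name. The function name lookup_in_hosts is made up for this illustration, and it only imitates the /etc/hosts step; it is not how the system resolver is actually implemented.

```shell
#!/bin/sh
# Print the first IP that a hosts-format file maps to a given name.
# Prints nothing when the name is not listed, which is the case where
# the real resolver would go on to ask a DNS server.
# lookup_in_hosts is a hypothetical helper, not a system command.
lookup_in_hosts() {
    hosts_file=$1
    name=$2
    awk -v name="$name" '
        /^[[:space:]]*#/ { next }           # skip comment lines
        {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break         # stop at a trailing comment
                if ($i == name) { print $1; exit }
            }
        }
    ' "$hosts_file"
}
```

Running lookup_in_hosts /etc/hosts michaeljaylissner.com at home should print the internal IP, and print nothing when the Internet version of the file is in place.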

When I am at home, I am always on a wireless network called "pizzapuppysantaclaus." Thus, by checking which wireless network I am connected to, I can tell whether I am at home and make whatever changes are necessary. Conveniently, whenever a network connection comes up, the system runs all of the scripts located in /etc/network/if-up.d/. Thus, we will put a small script in there that checks which wireless network we are on and then changes our /etc/hosts file if necessary.

To set up this configuration, I made three files. The first is the script mentioned above, which needs to be owned by root and placed in /etc/network/if-up.d. You can name it whatever you want, and by changing "pizzapuppysantaclaus" to the name of your network, you can fit it to your needs. Here are the contents of the script:

#!/bin/sh
# Check whether we are connected to pizzapuppysantaclaus.
# grep -q exits 0 on a hit (connected) and 1 otherwise.
if iwconfig 2> /dev/null | grep -q pizzapuppysantaclaus
then
  # At home: use the hosts file that points at the server's internal IP
  cp -f /etc/hostsIntranet /etc/hosts
else
  # Elsewhere: restore the normal hosts file
  cp -f /etc/hostsInternet /etc/hosts
fi

exit 0

This script simply checks the name of our wireless network. If it's pizzapuppysantaclaus, it copies /etc/hostsIntranet over /etc/hosts. If not, it copies /etc/hostsInternet over /etc/hosts.

The contents of /etc/hostsIntranet are:

192.168.1.132	michaeljaylissner.com
192.168.1.132	charityhikers.org
127.0.0.1	localhost
127.0.1.1	opal
 
# The following lines are desirable for IPv6 capable hosts
::1     localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts

And /etc/hostsInternet is just a copy of /etc/hosts.

So, to make this whole thing run, put the script in /etc/network/if-up.d, set its owner to root, and give it execute permission. Create a file called /etc/hostsIntranet that contains your intranet configuration, as shown above. Make a copy of your normal /etc/hosts file called /etc/hostsInternet.
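Those steps can be sketched as one small function. This is an illustration under assumptions, not the exact commands I ran: the names install_switcher and switch-hosts are made up, and the second argument is a filesystem root so you can rehearse in a scratch directory before touching the real /etc. Run it as root with an empty root argument and the installed files end up owned by root.

```shell
#!/bin/sh
# Install the switcher script and seed both hosts variants.
# $1 = path to the script
# $2 = filesystem root: "" for the real system, or a scratch directory
# install_switcher and switch-hosts are hypothetical names.
install_switcher() {
    script=$1
    root=$2
    mkdir -p "$root/etc/network/if-up.d"
    # As far as I know, run-parts skips file names containing a dot,
    # so the installed name has no ".sh" suffix.
    cp "$script" "$root/etc/network/if-up.d/switch-hosts"
    chmod 755 "$root/etc/network/if-up.d/switch-hosts"
    # Both variants start as copies of the current hosts file. The
    # intranet one then gets the internal IPs added by hand.
    [ -e "$root/etc/hostsInternet" ] || cp "$root/etc/hosts" "$root/etc/hostsInternet"
    [ -e "$root/etc/hostsIntranet" ] || cp "$root/etc/hosts" "$root/etc/hostsIntranet"
}
```

The existence checks are there so that running it a second time does not wipe out an /etc/hostsIntranet you have already edited.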

Once all that's done, you should be all set. Any questions, please feel free to comment!

Working with matplotlib and pycairo

I spent a good part of my winter break working on learning Python and using it for projects. One project was the Yelp scraper that I posted about previously, and another was a report for my old work.

The report is a statistical analysis of the development of about 2,000 children aged three and four. For those interested, I'll try to post it here once the final version is ready to go. In the past when making the report, I had been frustrated because there was no easy way to script the creation of the 30 or so charts that need to be made. Excel had been our data analysis tool, and as such, we were stuck with either using VBA to create the charts or making them by hand. Since nobody knew VBA, we always just buckled down and did the work by hand.

This time around, I discovered the matplotlib Python library, and used that to create the charts. It was a pretty rough experience all in all. While simple graphs can be created in about five lines of code, creating complicated ones took a good amount of work. For example, changing the tick markers on a graph requires that you create tick objects and then manipulate each of them individually in a for loop. Granted, I couldn't customize them at all in Excel, but figuring out that kind of change was a pain indeed.

The report itself required about 1,000 lines of code, and each chart required about 100-200 lines. For custom charts, I didn't find the library that useful. However, towards the end of the report there are 30 charts that are identical except for the data. For these charts, I was able to write a for loop that created them all in about 20 minutes, whereas previously they took me a few hours to make by hand.

Another library I spent some time learning was pycairo, the Python bindings for the cairo 2D graphics library, which lets you draw on top of existing images. I had planned to use it for any edits to the charts that I was unable to accomplish with matplotlib, but in the end, it was unnecessary. I have another project coming up, though, that will use pycairo, so look for that soon.

Rsnapshot Backup Solution OR Why Backing Up Is Hard To Do

I've been working on getting this post figured out for about five months. In this post, I am going to try to explain exactly how my backup works, and why. It's ridiculously complicated at times, but the detail is necessary on paper in some form or other.

For my backup system, I rely heavily on rsnapshot, which is a tool that uses rsync and some Perl scripting to create snapshots of directories. Rsync is a pretty awesome tool. It functions like a simple copy/paste, except that it checks the destination directory of the paste and only copies the necessary files. As such, it can be interrupted in the middle of a copy and will be able to continue later where it left off. Perl is a scripting language, and rsnapshot uses it to give rsync some extra power.

This power is the ability to perform incremental backups, which is to say that if I have 5GB of data that I back up 10 days in a row, it will only take up about 5GB of space, total. (Unchanged files are hard-linked to the previous snapshot rather than copied again.) Likewise, if I have 5GB of data today, an additional 5GB tomorrow, and another 5GB the day after, and I back it all up each day for ten days, it will require a total of 5GB of space the first day, 10GB the second day, 15GB the third day, and no more space after that for the remaining 7 days.
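To put numbers on that second example, here is a quick back-of-the-envelope sketch. It is plain arithmetic of my own for illustration, not anything rsnapshot runs, and it compares the incremental scheme with making a complete copy every day.

```shell
#!/bin/sh
# New data added on each of the ten days, in GB (days 4-10 add nothing).
total=0   # size of the live data on the current day
incr=0    # space used when each snapshot stores only what is new
full=0    # space used when every backup is a complete copy
for added in 5 5 5 0 0 0 0 0 0 0; do
    total=$((total + added))
    incr=$((incr + added))
    full=$((full + total))
done
echo "incremental: ${incr}GB, full copies: ${full}GB"
# prints "incremental: 15GB, full copies: 135GB"
```

The 135GB comes from 5 + 10 + 15 for the first three days plus 15GB for each of the remaining seven.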

This is important if you want to back up your data on a regular basis. Since I run a server, I have several things that I must back up. I back these up on a daily, weekly, and monthly basis. The list includes:

  • My laptop is backed up wirelessly to the server's hard drive
  • The email server gets copied to an external USB drive (this includes all the Zimbra configuration files as well as thousands of emails)
  • The web server gets backed up to the USB drive (this includes the Drupal installation and the MySQL database)
  • Lots of configuration files for the servers go to the USB drive (i.e. the /etc/ directory)
  • And finally, the backup configuration itself goes to the USB drive

Each of these backups presents its own difficulties. The server is challenging because the backup has to cover MySQL, Zimbra and Drupal. For Drupal, I have to coordinate the MySQL database dump so that when the Drupal backup is triggered, it will copy the MySQL information over to the USB drive along with the normal Drupal files. For Zimbra, the email server has to be stopped, backed up, and then started again, which means control of the email server has to be carefully scripted.
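The stop, back up, start sequence for Zimbra can be sketched like this. It is a simplified illustration rather than the script attached below; the function name backup_while_stopped is made up, and the three commands are passed in as strings so the same wrapper works for any service.

```shell
#!/bin/sh
# Run a backup command with a service stopped, and always try to start
# the service again, even when the backup itself fails.
# $1 = stop command, $2 = backup command, $3 = start command
# backup_while_stopped is a hypothetical helper name.
backup_while_stopped() {
    stop_cmd=$1
    backup_cmd=$2
    start_cmd=$3
    sh -c "$stop_cmd" || return 1      # never copy a running mail store
    status=0
    sh -c "$backup_cmd" || status=$?   # remember whether the copy worked
    sh -c "$start_cmd" || return 1     # mail comes back either way
    return "$status"
}
```

On my setup the call would look roughly like backup_while_stopped 'su - zimbra -c "zmcontrol stop"' 'rsnapshot -c /etc/rsnapshot-zimbra.conf daily' 'su - zimbra -c "zmcontrol start"', where the configuration path is only an example. The point of the design is that a failed copy still leaves the mail server running, and the non-zero return value is what a cron job can use to report the failure.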

The laptop presents a challenge because it is the only thing that is backed up wirelessly, and in order to do so, the server must authenticate itself to the laptop before it is allowed to log in and make the copies. As if that weren't complicated enough, the laptop also needs to be set up with a static IP address so that the server can find it to perform the backup. Finally, the laptop needs to be ON and connected to the network when the server attempts the backup.

Once all of that is figured out logically, you have to authenticate the server to the laptop, create the scripts, and write the backup configurations and cron files. I have attached some of these configuration files to this post, leaving out any that reveal too much of my network topology.

One final challenge that had to be overcome was connecting the USB drive to the server in such a way that it would always be mounted in the same location. In addition, I learned that FAT32 doesn't support hard links, which rsnapshot relies on, so I had to format the USB drive as ext3.
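The links matter because they are how an unchanged file can appear in two snapshots while being stored only once. Here is a small demonstration in a scratch directory. The daily.0 and daily.1 names just imitate snapshot folders for the example, and on a FAT32 drive the ln step is the one that fails.

```shell
#!/bin/sh
# Two "snapshots" of one unchanged file, the second made with a hard link.
work=$(mktemp -d)
mkdir "$work/daily.1" "$work/daily.0"
echo "unchanged data" > "$work/daily.1/file.txt"
ln "$work/daily.1/file.txt" "$work/daily.0/file.txt"

# Both names point at the same inode, so the data is stored once.
inode_old=$(ls -i "$work/daily.1/file.txt" | awk '{print $1}')
inode_new=$(ls -i "$work/daily.0/file.txt" | awk '{print $1}')
[ "$inode_old" = "$inode_new" ] && echo "one copy on disk, two names"
```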

As of today, it's about five months since I began this project, and I believe I can say that the backup happens flawlessly on a daily, weekly and monthly basis. There are a few things I'd like to change, however:

  • I'd like to get an email notification when a backup fails — Done - See comment below.
  • I'd like to begin backing up /etc/ on my laptop — Done
  • At one point, I was backing up a list of all the installed software on my system - it'd be nice to have that again — Done - I wrote a python script to do so
  • The backup is unencrypted, so anybody can take the USB drive and have a heck of a lot of emails. Gotta fix that. — See the note below in the comments for details.

Files of Interest

  1. rsnapshot configuration for my laptop
  2. rsnapshot configuration for the Drupal server
  3. rsnapshot configuration for the backup configuration files and the /etc directory
  4. rsnapshot configuration for the Zimbra server
  5. scripts to stop/start Zimbra
  6. My cron file

All in all, this just goes to show that backing up is a very difficult thing to do properly and automatically. It's one thing if you have a desktop that backs up to a USB drive. It's another if you have a server and a laptop. Had I known how long this would take going into it, I'm not sure I would have figured it all out. How the average computer user is supposed to figure this out is beyond me.

GIS Training

My work sent me on a Geographic Information System (GIS) training today, and I must say it was a very educational experience. I was unfamiliar with GIS before the training, and after today, look what I can do:

The basic idea is to be able to take census tracts of land, combine them with population studies, and then plot the information geographically (as above). The software of choice for places that need closed-source software (such as my work) seems to be ArcGIS, but there are several open source alternatives that are free. The two that I've found so far are GRASS and Quantum GIS (aka QGIS).

From what I can tell, GRASS is the original, old-skool GIS developed by the military, and Quantum GIS is the up-and-coming GIS that wants to supplant it. I'll try to come back once I've tried them both, and let people know what I learn. In the meantime, I must say it's exciting to be able to make graphs like these.

Any requests for studies? I'm amped to test these out...

PCT Data Project - DONE

I'm happy to announce that the PCT data project is complete!

Over the past several weeks/months, I have been slaving away over my computer writing this program. When used, it will generate a dynamic graphing area that will load up temperature data for one to six PCT hikers.

All those that are interested in the most complicated programming assignment I have ever worked on are welcome to check it out at michaeljaylissner.com/pct-temperatures.

I am officially a free man once again! Thanks to all who made this possible with their encouragement and patience!

The Great Temperature Data Project

Back in '05 when I hiked from Mexico to Canada on the Pacific Crest Trail, I carried a little device called an iButton. This little device contains essentially three things: a clock, a bit of memory and a thermometer. It's waterproof, accurate to 0.1 degrees Celsius, and is about the size of five dimes stacked one upon another. There are a bunch of silly things you can do with these, but what I chose to do with mine was to have it record the temperature every hour on the hour for the entire time I was hiking, with the idea being to get some good data about the temperature out there on the PCT.

All in all, you can figure that the temperature was recorded 24 times a day for about 150 days, for an astounding 3,600 data points and about 150 oscillations from the daytime high to the nighttime low. I've spent some time working with the data, and it's pretty much impossible to make much use of... unless you write a program to interpret it. You can see it for yourself if you're interested.

Well, as fate would have it, I am currently enrolled in a Java programming class, and I have the option of doing a final project of my own choosing. Not having put this data to good use has been a burden on my soul for a couple of years now, and I've decided to make my final project an applet that will allow a user to plot this data on a graph for any date range and any time range that they choose (e.g. 5pm to 10pm for September 20th to 23rd).
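The selection step at the heart of that idea can be sketched in a few lines of shell. This is not the applet, which will be in Java, and the input format is an assumption: I am pretending each reading is a line like "2005-09-20 17:00,12.5". The function name filter_readings is likewise made up.

```shell
#!/bin/sh
# Keep readings whose date is within [from_date, to_date] and whose hour
# is within [from_hour, to_hour], both inclusive. Reads standard input.
# Assumed line format: "YYYY-MM-DD HH:MM,degrees" (hypothetical).
# Usage: filter_readings FROM_DATE TO_DATE FROM_HOUR TO_HOUR < readings
filter_readings() {
    awk -F'[ :,]' -v d1="$1" -v d2="$2" -v h1="$3" -v h2="$4" '
        $1 >= d1 && $1 <= d2 && $2 + 0 >= h1 && $2 + 0 <= h2
    '
}
```

For the example above, the call would be filter_readings 2005-09-20 2005-09-23 17 22. Dates written year first sort correctly as plain text, which is why a simple string comparison is enough for the date range.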

Once this is done, I will attempt to post it here, but here's my question to you, dear reader: do you have any suggestions for features that you would be interested in seeing in an applet of this sort? Thoughts?

I'm quite excited about getting this info out there. Finally.
