Posts

Showing posts from 2012

Basic word frequency analysis

Here are some interesting terms in the Democratic presidential debate from 2008:

I believe we're at a defining moment in our history. Our nation is at war; our planet is in peril....

-------------------------------
Total Count of most terms : 9125
Interesting Word Freq Count: 1952
-------------------------------
id=1 ct=112(39.16%) term=think
id=2 ct=101(35.31%) term=applause
id=3 ct=97(33.92%) term=clinton
id=4 ct=97(33.92%) term=people
id=5 ct=85(29.72%) term=senator
id=6 ct=66(23.08%) term=health
id=7 ct=62(21.68%) term=obama
id=8 ct=56(19.58%) term=care
id=9 ct=56(19.58%) term=blitzer
id=10 ct=47(16.43%) term=right
id=11 ct=44(15.38%) term=president
id=12 ct=40(13.99%) term=country
id=13 ct=35(12.24%) term=make
id=14 ct=34(11.89%) term=plan
id=15 ct=32(11.19%) term=question
id=16 ct=30(10.49%) term=believe
id=17 ct=30(10.49%) term=important
id=18 ct=28(9.79%) term=issue
id=19 ct=28(9.79%) term=take
id=20 ct=27(9.44%) term=time
id=21 ct=26(9.09%) ...
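For readers who want to try a count like this themselves, here is a minimal Python sketch. It is not the code behind the numbers in this post. The stop-word list, the minimum word length, and the function name interesting_terms are all assumptions made for illustration, and the percentage column is left out because the post does not say how it was computed.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the post does not say which words it filtered out.
STOP_WORDS = {"the", "and", "our", "are", "that", "this", "for", "with", "we're"}

def interesting_terms(text, top_n=5):
    """Return (term, count) pairs for the most frequent non-stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    # Assumption: very short words are treated as uninteresting.
    counts = Counter(w for w in words if w not in STOP_WORDS and len(w) > 2)
    return counts.most_common(top_n)

sample = ("I believe we're at a defining moment in our history. "
          "Our nation is at war; our planet is in peril.")
for rank, (term, count) in enumerate(interesting_terms(sample), start=1):
    print(f"id={rank} ct={count} term={term}")
```

Run against a full debate transcript, the same function produces a ranked list in the id/ct/term style shown above.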

Just some code

Just some code
jdistprop: Distributed Property Files With Java

This project supports distributed property files for easy loading with Java projects. It is a solution to this problem:

In a J2EE environment, we are normally used to storing text in a property/resource file.

firstName=First Name
someOtherData=This is the data to display on screen, from property file

If you are in an environment where it is difficult to update those property files on a regular basis, what architecture are developers using to change text/label content that would normally reside in a property file? Or let's say you need to change that content before re-deploying a property file change. One solution is to store that content in a database. Are developers using MemcacheDB? Is that usually used for caching solutions? Would you use a solution outside of the Java framework, like a key/value datastore such as MemcacheDB?

http://code.google.com/p/jdistprop/

Build a Java virtual machine that is actually readable/modifiable for Win32

Is it possible to build a Java virtual machine for Win32 in a reasonably understandable way? It is mostly impossible. The OpenJDK build takes hours just to run, and preparing your environment takes a day. If you aren't a core JVM developer, who is actually going to spend several days prepping an environment just to hack on OpenJDK? The Jikes RVM may work with Cygwin/Win32, but it is mostly designed for Linux or some other open platform. That really leaves only JamVM. I was able to install all of the dependencies with Cygwin and perform a build in 15 minutes, then edit the C source, add a log statement, rebuild, and run against a bytecode class file. JamVM is the only JVM project I found that is understandable (20-30 core C files) and whose build actually works with a modern version of Cygwin.

I will go through some of the setup. It helps to actually install JamVM through Cygwin. Full install. ...

Code snippet of the day: Haskell for the dumb idiot lazy programmers

The Project Euler problem reads as follows: "If we list all the natural numbers below 10 that are multiples of 3 or 5, we get 3, 5, 6 and 9. The sum of these multiples is 23. Find the sum of all the multiples of 3 or 5 below 1000."

Here is one of many implementations in Haskell. I used a verbose recursive approach: iterate up to 1000 and build a list with the items of interest, in this case the multiples of 3 or 5. The first implementation contains a logging utility for writing a string at each iteration.

Figure 1: Euler Problem 1 in Haskell

Here is the second source snippet. I wanted to provide something more practical: a log parsing example that you can run against your web log files.

Figure 2: Applied Haskell, simply read each line of a file, find a term and output the results to another file.

Source: https://javanotebook.googlecode.com/svn/trunk/math/MathServices/docs/haskell
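Since the figures are not reproduced in this listing, here is a rough Python sketch of the same idea: walk the numbers below a limit and collect the multiples of 3 or 5. It is not the Haskell source from the post. It uses a plain loop instead of recursion because Python caps recursion depth, and the function name and the optional log callback are assumptions standing in for the logging utility mentioned above.

```python
def multiples_of_3_or_5(limit, log=None):
    """Collect the natural numbers below limit that are multiples of 3 or 5."""
    found = []
    for n in range(1, limit):
        if n % 3 == 0 or n % 5 == 0:
            found.append(n)
            # Hypothetical logging hook: called once per number that is kept.
            if log is not None:
                log(f"n={n} running_sum={sum(found)}")
    return found

print(multiples_of_3_or_5(10))          # [3, 5, 6, 9]
print(sum(multiples_of_3_or_5(10)))     # 23
print(sum(multiples_of_3_or_5(1000)))   # 233168
```

Passing log=print writes one line per match, which is handy for checking the small case (below 10) against the numbers quoted in the problem statement.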

Image and Code of the Day

Fig 1: Graph of a 'worst case' number of operations, n^2.

Fig 2: Quick Sort Number of Operations vs {n * log(n)} graph. Quick Sort average case is {n * log(n)}. During the quick sort simulation, the number of reported operations aligned with the average case {n * log(n)}.

Fig 3: R Source for plot of line graph. Data consists of rows of tab delimited integer values.

Fig 4: Quick Sort Source in Java
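To make the comparison concrete, here is a small Python sketch that counts operations during a quick sort and prints them next to n * log2(n) and n^2. It is not the Java source from Fig 4. What counts as one "operation" is an assumption (here, one element examined per partition pass), and the input sizes and the random seed are arbitrary choices.

```python
import math
import random

def quick_sort(items, counter):
    """Return a sorted copy of items; counter[0] accumulates elements examined."""
    if len(items) <= 1:
        return items
    pivot = items[len(items) // 2]
    less, equal, greater = [], [], []
    for x in items:
        counter[0] += 1  # one operation per element per partition pass
        if x < pivot:
            less.append(x)
        elif x == pivot:
            equal.append(x)
        else:
            greater.append(x)
    return quick_sort(less, counter) + equal + quick_sort(greater, counter)

rng = random.Random(0)  # fixed seed so repeated runs use the same data
for n in (100, 1000, 10000):
    data = [rng.randrange(1_000_000) for _ in range(n)]
    counter = [0]
    assert quick_sort(data, counter) == sorted(data)
    print(f"n={n} operations={counter[0]} "
          f"n*log2(n)={n * math.log2(n):.0f} n^2={n * n}")
```

Each printed line is one tab-free row of integers, so the output can be reshaped into the kind of data file the R plot in Fig 3 reads.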

New Anime Reviews by Berlin Brown

Anime is Japanese animation, and it is not necessarily targeted at kids. In fact, some animation can be pretty rough and make a 40-year-old squirm. Basically, Anime has characters and culture, stories, plots. If you like interesting stories, then you will like most of the popular Anime.

What is the difference between Manga and Anime? I am American and usually just refer to anything that moves and is from Japan as "Anime" or Japanimation. But that isn't entirely accurate. Some Anime TV series and movies started out as Manga comic books in Japan and were then adapted for the screen: Manga to Anime. Manga is the Japanese term for comic books. If someone says, "Fullmetal Alchemist is a great Manga", they could be referring to the comic book version of Fullmetal or the TV adaptation, but normally they mean the comic book. Anime can be an animated series derived from a Manga.

Popular Adult Stuff

There are many genres of Anime. I like the adult themed stuff. Ho...

House MD

House MD is a good TV show -- berlin brown http://math-services.appspot.com/

Random Technology Links

Here are random technology links: [Jan 2012]

http://nlpwp.org/book/chap-words.xhtml
http://www.stanford.edu/class/cs242/readings/backus.pdf
http://stackoverflow.com/questions/1962278/dealing-with-timestamps-in-r
http://math.illinoisstate.edu/dhkim/rstuff/rtutor.html
http://www.harding.edu/fmccown/r/
http://www.rforge.net/rJava/
http://www.ugo.com/games/strategy-games-top-50?page=5
http://www.cs.uiuc.edu/~jeffe/teaching/algorithms/
http://www.nlp-class.org/
http://www.security-class.org/
COLT: http://acs.lbl.gov/~hoschek/colt/
http://weka.sourceforge.net/doc.dev/weka/classifiers/bayes/BayesNet.html
http://apache-wicket.1842946.n4.nabble.com/Wicket-Session-grows-too-big-real-fast-td1875816.html
http://www.keysbotzum.com (public)
http://www.cell-auto.com/
http://bayfiles.com/
http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/
http://linuxproblem.org/art_9.html
http://www.mitpressjournals.org/doi/ab...

I am Zit Pomney and I am running for President

I am Zit Pomney and I am running for President. I have a vision for America. I want to bring jobs to this country. I know I am the front runner but I am a tough boy, I can take the attacks from the other candidates. I will get America back to work. I will work hard for America. America is a great place with great people. And we need to balance our budget.

"I love how the current candidates don't actually say anything of substance 99.9% of the time. So I have created my fictional characters, meet Zit Pomney"

Thirty second review of top modern TV shows

This is a short review of about a dozen top TV shows that have appeared in the last decade or so. I advise you to purchase the DVD sets and watch them on your own schedule. These aren't sorted in any particular order, but the better shows will normally appear at the top.

The Wire (10/10) - (Urban gang land violence)
So far the best TV show I have seen. It was getting predictable and dry towards the end of the series, but I don't think there is any better show that feels "real". It is dark and gritty and highly watchable.

Dexter (9.9/10) - (Almost comical look at a serial killer do-gooder)
Dexter is the show about the serial killer who also happens to be a blood spatter analyst in the police department. Normally he finds another bad guy or serial killer and does away with him. It is also a dark series with lots of gore, and sometimes it can be serious. But I tend to find some of it comical. The Wire is not comical at all, Dexter ca...

Python matplotlib plotting setup for cygwin

Matplotlib is a popular Python library for generating plot graphics. It works with Cygwin on Win32, but some non-intuitive steps are required.

Install Cygwin:
CYGWIN_NT-5.1 1.7.9(0.237/5/3) 2011-03-29 10:10 i686 Cygwin
I am using Cygwin with setup 2.7.3.

Install Python through Cygwin:
This document describes installing Python and matplotlib with Cygwin. Most Windows and Cygwin users would normally use the external Python executable. I prefer to run all of my script-oriented applications through Cygwin, including Python.
Python 2.6.5 (r265:79063, Jun 12 2010, 17:07:01) [GCC 4.3.4 20090804 (release) 1] on cygwin

In the Cygwin setup.exe installer, install:
The freetype libs, python-gtk, python-tk, libpng

Download matplotlib from SourceForge:
I am using the latest version, matplotlib-1.1.0, as of 1/2012.

Issues with standard Python install and Cygwin:
The typical 'python setup.py install' will not work with Cygwin. You...

2012 Campaign, help from political science community

After this race is over, I wish a fair political science group could look at the media's handling of this race. Here are some BIG missteps in how the media handled this race:

Sarah Palin, why prop her up at all?
Rick Perry enters the race and within two weeks, he is a front runner
Herman Cain, once again, why prop him up?
And then Newt's rise
Look at the criticism against Ron Paul.

Someone could just look at the timing and wording of discussion on the candidates. It is obvious they have their favorites. ...

CNN and Fox didn't campaign for Ron Paul 24/7 like they did with other candidates. But they did have Ron Paul on their programs. Ron Paul had to defend his own platform, while CNN and Fox did the campaigning for the other candidates. And it is easy to spot: just look at the amount of coverage, wording and discussion.