Data sets for Machine Learning

Grad school was a few years ago, and things are moving fast, so I am writing a bunch of small ML programs to get current with the state of the art again. I am also switching from C++ to Python for ML work. Wow, that’s a huge improvement right there – can’t believe I waited this long.

The biggest problem in writing ML code is finding decent data sets. The UCI Machine Learning Repository has links to a bunch of curated data sets. I am posting it here, like the rest of the ML stuff I’m about to, so it’s easy to find and point friends to.

Decision Tree Classifiers – A simple example

Here is a simple “Machine Learning” Python program that uses scikit-learn’s DecisionTreeClassifier to predict your body type from your height and weight. For the record – this is why people hate BMI and things like it. After writing this I think I need to go on a diet.

Identification Trees – often called decision trees – provide a way to deterministically map a bunch of qualitative observations into predictions. Basically the predictions are a set of observed output states, and we are looking for observable features, inputs, that we can use in a tree of tests.

Training builds the decision tree from two sets of data: our set of observations and a set of labels, one for each observation. Each node in the tree represents a test that cuts the training set into subsets – each subset going on either to a subsequent test, or to a leaf node representing a specific output label or state.

The MIT open courseware video Identification Trees and Disorder is a good introduction.
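To make the idea of "disorder" concrete, here is a minimal sketch of how a candidate cut could be scored with Shannon entropy. The function names and the toy samples are made up for illustration, and this is not necessarily what scikit-learn does internally: its DecisionTreeClassifier defaults to Gini impurity rather than entropy.

```python
import math
from collections import Counter


def disorder(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    # p * log2(1/p) summed over the classes present in the list.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def split_disorder(samples, labels, feature_index, threshold):
    """Weighted average disorder of the two subsets made by one cut."""
    left = [lab for s, lab in zip(samples, labels) if s[feature_index] <= threshold]
    right = [lab for s, lab in zip(samples, labels) if s[feature_index] > threshold]
    total = len(labels)
    return (len(left) / total) * disorder(left) + (len(right) / total) * disorder(right)


# Hypothetical toy data: [weight in lbs, height in inches] pairs.
toy_samples = [[120, 66], [130, 66], [170, 66], [180, 66]]
toy_labels = ["NOR", "NOR", "OVE", "OVE"]

print(disorder(toy_labels))                             # 1.0 (an even two-way mix)
print(split_disorder(toy_samples, toy_labels, 0, 150))  # 0.0 (a perfect cut on weight)
print(split_disorder(toy_samples, toy_labels, 0, 125))  # a worse cut, so more disorder
```

A tree builder would try many such cuts at each node and keep the one that leaves the least disorder behind.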

So let’s say we wanted to determine, based on someone’s height and weight, if they were overweight or not. To get some training data we could take a bunch of random samples of a representative population of people – ask them their height and weight, and then create labels for each person determining if they were of a normal weight, overweight, or obese. That would not be a fun data set to try and collect – so let’s cheat.

The Body Mass Index, or BMI, is an equation already derived from population health data that roughly maps height and weight into a single number. The BMI can be used to predict whether a person is underweight, normal weight, overweight, or obese. The equation is roughly BMI = (weight in pounds * 703) / (height in inches)^2. A BMI of less than 18.5 counts as underweight, a BMI from 18.5 up to 25 reflects a normal weight, a BMI of 25 to 30 corresponds to being overweight, and a BMI over 30 signals obesity. So the BMI equation lets us build a table mapping height and weight to a label, one that would be representative of uniform sampling of a large population.
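As a rough sketch of how such a table could be generated from the equation and cutoffs above: the helper names, the "UND" label, and the grid of heights and weights are all assumptions made for illustration, not the table actually used for training.

```python
def bmi(weight_lbs, height_in):
    """Body Mass Index from weight in pounds and height in inches."""
    return weight_lbs * 703 / height_in ** 2


def bmi_label(weight_lbs, height_in):
    """Map height and weight to a short label ("UND" is a made-up label)."""
    value = bmi(weight_lbs, height_in)
    if value < 18.5:
        return "UND"  # underweight
    if value < 25:
        return "NOR"  # normal weight
    if value < 30:
        return "OVE"  # overweight
    return "OBE"      # obese


def build_table(heights_in, weights_lbs):
    """Return ([weight, height] samples, labels) for every grid point."""
    samples, labels = [], []
    for height in heights_in:
        for weight in weights_lbs:
            samples.append([weight, height])
            labels.append(bmi_label(weight, height))
    return samples, labels


# Hypothetical grid: heights 58-76 inches, weights 90-285 lbs in 5 lb steps.
samples, labels = build_table(range(58, 77), range(90, 290, 5))
print(len(samples), bmi_label(225, 70), bmi_label(175, 70), bmi_label(168, 70))
# prints: 760 OBE OVE NOR
```

The samples are ordered as [weight, height] so they line up with the order used when calling the classifier.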

So, using the BMI table as training data, here is a simple Python program using sklearn’s DecisionTreeClassifier to tell you if you are obese, overweight, or normal weight.


from sklearn import tree

# Labels, one per sample: "NOR" = normal, "OVE" = overweight, "OBE" = obese.
BMI_features = [ "NOR", "NOR", ... lots of data here ... , "OBE", "OBE"]
# Samples as [weight in lbs, height in inches] pairs.
Height_in_Weight_lbs_samples = [[91,58],[96,58], ... lots of data here ... ,[279,76],[287,76]]

# Create identification tree from BMI table.
clf = tree.DecisionTreeClassifier()
clf = clf.fit(Height_in_Weight_lbs_samples, BMI_features)

# Keep prompting until the user enters a blank line.
while True:
    weight = input("Enter your weight in lbs: ")
    if not weight:
        break

    height = input("Enter your height in inches: ")
    if not height:
        break

    # input() returns strings, so convert to numbers before predicting.
    prediction = clf.predict([[float(weight), float(height)]])
    print("It appears that you are:", prediction, "\r\n")

Output of the program looks something like this. Yeah, I’m regretting both dessert and choosing this example.


Enter your weight in lbs: 225
Enter your height in inches: 70
It appears that you are: ['OBE']

Enter your weight in lbs: 175
Enter your height in inches: 70
It appears that you are: ['OVE']

Enter your weight in lbs: 168
Enter your height in inches: 70
It appears that you are: ['NOR']

Enter your weight in lbs:

Sometimes it can be useful to look directly at the generated decision tree. This code generates a visualization of the tree.


# Generate a graph visualizing the trained decision tree.
import graphviz
dot_data = tree.export_graphviz(clf, out_file=None)
graph = graphviz.Source(dot_data)
graph.render("BMI_Table", view=True)
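If Graphviz is not installed, scikit-learn can also print the tree as plain text with export_text. The sketch below trains on a tiny made-up data set (six samples, all at 66 inches) rather than the full BMI table, and the feature names are ones I picked for illustration.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical toy data: [weight in lbs, height in inches] pairs.
toy_samples = [[120, 66], [130, 66], [170, 66], [180, 66], [200, 66], [210, 66]]
toy_labels = ["NOR", "NOR", "OVE", "OVE", "OBE", "OBE"]

toy_clf = DecisionTreeClassifier(random_state=0)
toy_clf.fit(toy_samples, toy_labels)

# Render the learned tests as indented text, one line per node.
tree_text = export_text(toy_clf, feature_names=["weight_lbs", "height_in"])
print(tree_text)
```

Because every toy sample has the same height, the only useful tests here are on weight. The full table would give a deeper tree that tests both features.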

I put this code, including the full data sets for training, up at: git@github.com:aarontoney/Machine_Learning_Examples.git

Stupider Friends

I really need to get stupider friends. Years ago my old climbing partner from Australia came to visit. For part of the trip, he and a friend of his, Jesse (now a mutual friend), came up to Seattle for some climbing. I had to work so I missed the second half of the trip – but I lent Grant my rack.

Well, the two of them were nice enough to re-mark my gear – you know – just so it was consistently marked and to make sure I got it all back. They marked it with pink tape. Hot pink tape. I would not be surprised to find out they had to drive around to find it specially. When, a decade later, Grant, Jesse, and I went climbing in Colorado and went to upgrade gear, they presented me with a roll of hot pink duct tape.

rackupgrade_mar_2018

So now whenever I get new gear – the first thing I do when I get home is mark it all with pink. Hot pink. I need to get stupider friends. This is a practical joke that’s been running for nearly a decade with no sign of ending. On the plus side – I have yet to meet someone who marks their gear with the same color.

Day climbing at Deception Crags (March 20th, 2018)

Made it out to Deception Crags for some more practice today, and boy did I need it.

We started the day by practicing rope work and rappelling again. Here you can see Tyler transitioning from his personal anchor system over to a rappel. It is hard to see in the picture, but he does have a prusik backup set up. It’s actually a good spot to practice. It’s a safe approach to the anchors, but immediately exposes you to a vertical drop to practice with.

20180320_131708

Must have action shot. We each went down the rope about 3 times. The main thing was practicing and drilling on coming onto our personal anchor system, then coming off it to a rappel. After that we pulled ropes and went to climb on write-off wall. That’s where things went a bit sideways.

20180320_132351

I was supposed to lead the unnamed climb, set up an anchor, and then come down so Tyler could climb it. I had never climbed this route before, so I got about 4-5 bolts up before realizing that while the line was straight up until then, at the very end of the climb it broke hard right and finished directly over another occupied climb. I was worried about showering the climbers with rocks and crap, so I ended up bailing off onto Knife in the Toaster, one climb over to our left. It was a dog’s breakfast. The rope literally went up, over, and down.

Complicating things – and part of the problem, I think – was that I was more sketched out than I realized by leading the climb. The climb is totally a 5.6 on top-rope, but it has two nasty-looking falls. All in all, leading and down-climbing the 5.7/5.8 climb I had to do to get the gear I left behind felt easy by comparison. Something about that climb was just messing with my head. The climb runs about 3 feet to the left of the rope line Tyler is on in this picture.

In the end Tyler climbed it gracefully, then rappelled back off the climb.

20180320_152348

In a word, ugh. There are no bad days in the mountains when nobody gets hurt, but this was just a weird, weird day. I’m glad we put in the time to practice. It was a fun if frustrating day, but I think I really needed the drill time.

Day climbing at Deception Crags (March 16th, 2018)

Tyler and I started the day by setting up a top rope on Glob Job. It is only a 5.7 – but neither of us wanted to lead climb it since it looks like it has a real ankle breaker of a first bolt start. Here you can see Tyler setting up the rope for the climb.

20180316_122334

Turns out, once we were on it, the first bolt is not that bad; the crux move seems to be clearing a tiny bulge in the wall to get to the second bolt. I figure we will try leading it later in the summer, but ick. The concrete crack is full of small, sharp stones and just ripped up our hands. I ended up coming off the climb and taping up to do it.

After that we hiked around to write-off rock and did the unnamed 5.6 between Knife in the Toaster and Mom There’s Pink in my Burger. It was Tyler’s first sport lead – and it was a decent climb for that. Solid holds and well protected. The only weird part is that the chains are kind of far above the last decent footholds. So Tyler got up there only to find that the chain link PAS system I lent him was too short, so it was kind of awkward for him setting things up. I felt bad. I use a longer anchor system, so I had fewer problems breaking things down when I cleaned.

While we do need to get faster – it was all in all a good day. We got a bunch of solid rope work practice and climbing in. Joel and Owen met up with Tyler and me.

20180316_153422

Owen’s only 6 – but it looks like he will be a hell of a climber one day. The picture has him motoring up a 5.4, and we had to physically lift him off the climb to prevent him from just motoring on up the rock. I’ll be curious to see how he does when we bring him out there with a harness and let him climb roped up. Probably the weirdest part is that the last time Joel was out there was when we were climbing together in college twenty years ago.

Day climbing at Deception Crags (March 10th, 2018)

Tyler and I took my aunt climbing today at Deception Crags up at Exit 38. It had been nearly a decade since either of us had climbed there – so we went up without a guidebook and kind of played things by ear.

Unfortunately, I picked a 5.9 for my aunt’s first climb – Knife in the Toaster. It is smack in the middle of “Write-off Rock”. She did awesome, but it has a 5.4 on the left and a 5.6 on the right – so I just totally miscalled the difficulty of the climb. I felt especially bad since right after she left we ran up Flammable Pajamas, which was the super easy 5.4.

Probably the first lesson of the day was that I look ridiculous in a pony tail. I’m choosing to ignore that lesson though.

climbing_mar11_2018_p1

Knife in the Toaster had five bolts and was my first lead of the season, and my first in nearly a year.

It’s actually more like a 5.7 climb, with a 5.9 crux move at the end. I just was not seeing the last move though. I got the bolt clipped, and kept going up and down looking for the next move but just was not seeing it.

Here you can see the line for the climb – I just needed to head straight up and to the left a bit to nail it. In the end I finished by going around and to my right. The lead for Flammable Pajamas was next, and it was super weird. You have to climb halfway up the route to get to the first bolt, so the climb, with only two bolts, feels like it is over before it starts.

All in all a lovely day – we need to go out there again soon.

Ben holds a grudge

Ben must hold a grudge – I can’t tell which is worse – this picture of me or my board position. I got slaughtered at Go – but the whisky was excellent. Come to think of it – those two things may be related. All in all I am calling the trip, if not the game, a win.

IMG_20180302_221414

Early March attempt on the Tooth

Yesterday Conrad and I attempted a snowshoe approach for climbing the Tooth. We were initially going to meet at the trailhead for a 9am start. This was my first snowshoe trip in several seasons, and Conrad’s first time snowshoeing, so we ended up deciding to meet at REI at 9 instead so we could each pick up a few things.

Unfortunately, on Sunday REI does not open until 10, so we lost an hour. Then we got stuck in a crazy traffic snarl getting to the pass, so we did not hit the trailhead for our start until 2. Yeah, it was a super late start. We had no idea what the hike-in times would be – so we decided to bring full climbing packs anyway in case we got lucky. If we didn’t, then worst case we would know our travel time with packs for when we come back.

I think we both had a fun day. I know Conrad got a lot of laughter in when I tried out a new downhill “glide step” technique. It started out awesome for the first 3 steps – then it turned into me tumbling ass over teakettle downhill.

I think we both were having fun.

20180304_160858

20180304_153933

We hiked in to Source Lake, then turned southeast to switchback up to the Great Scott Basin. In this picture it’s that V where the two dotted paths split just below the Tooth. Not the chimney, the one on the snow field.

Screen Shot 2018-03-03 at 11.57.32 PM

We were losing the light – so we turned around just under the 3 hour mark. We were making good time down the mountain, so we took a break for a celebratory beer and some food before motoring on out. That’s going to become a tradition – because that beer was perfect.

It started snowing fairly heavily on the way out – which I found hilarious – since Conrad was at times like an abominable snowman in front of me. I found the half inch of snow hanging on the tip of his ice axe especially funny for some reason.

We ended up hiking out the last hour and a half with headlamps. No one was anywhere around – so the world seemed to be ours. Here is what it looked like. We only had to backtrack once – and that was my fault.

All in all a lovely day. With an 8am start next time we should have no problem summiting and hiking out before we lose the light. Can’t wait.

Idea for an easily movable couch

I have started interviewing for gigs outside of Seattle – and that has me thinking about furniture and moving. Well the Seattle Bouldering Project has some awesome – but comfortable – collapsible chairs. I am thinking they would be easy to make, easy to move, and a decent way to store crash pads the 99.9% of the time when you are not climbing.

20180228_122530

This picture shows how the pieces of the chairs slot together (and Tyler tying his shoes). I’d probably add some reinforcement to the sides of the slots, but with a high quality plywood they should be really strong.

I am thinking I could sew up a cover for when they are acting as couch cushions. That would give me something easy to wash, and it would help keep me from dragging dirt or chalk dust back into the house as the crash pads get used.

20180228_122547

The more I think about it, the more I am really liking this idea.