Saturday, April 5, 2014
Saturday, March 15, 2014
Horse Racing: Introductory Analysis
I want to start getting my head around the basic information that could influence outcomes.
I have these race-level fields, sloppily copy-pasted from the R terminal. I've bolded the fields that scream to me as being relevant.
[1] "Track" "Date" "Race Number" "Race Type"
[5] "HorseType" "ClaimPriceHigh" "ClaimPriceLow" "Race Length"
[9] "Record Time" "Record Date" "Purse" "Plus"
[13] "Available Money" "Valueist" "Value2nd" "Value3rd"
[17] "Weather" "Track" "Time" "Start"
[21] "2FTime1" "2FTime2" "3FTime1" "3FTime2"
[25] "3FTime3" "4FTime1" "4FTime2" "4FTime3"
[29] "4FTime4" "t_Final" "2SplT1" "2SplT2"
[33] "3SplT1" "3SplT2" "3SplT3" "4SplT1"
[37] "4SplT2" "4SplT3" "4SplT4" "Run-up"
[41] "WPS_Pool" "ID"
Horse Type:
Of the 41 unique strings with horse type information, here are those that occur more than 10 times.
Most horses are three or more years old. Most are fillies. This information might need to be coerced into similar groups to be useful, but until I know what all of this means, I don't want to lose information.
I've only seen fillies run since I've been out, so I'm assuming this is the norm. The highest-frequency entry at the top is vague about whether the horses are male or female, so I'm going to assume female unless otherwise specified.
Count  Horse type
   59  For Thoroughbred Three Year Old and Upward
   27  For Thoroughbred Three Year Old and Upward Fillies and Mares
   23  For Thoroughbred Three Year Old Fillies
   23  For Thoroughbred Four Year Old and Upward
   21  For Thoroughbred Two Year Old
   20  For Thoroughbred Three Year Old
   16  For Thoroughbred Two Year Old Fillies
   15  For Thoroughbred Three Year Old and Upward (NW2 L)
   14  For Thoroughbred Three Year Old and Upward Fillies and Mares (NW2 L)
   12  For Thoroughbred Three Year Old and Upward Fillies and Mares (NW2 L X)
   11  For Thoroughbred Four Year Old and Upward Fillies and Mares
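One possible way to coerce these strings into groups without throwing anything away is to split each one into an age part, a sex part and a trailing condition code. The sketch below is in Python rather than R, and it is only a guess at the structure based on the eleven strings listed here: the function name `parse_horse_type`, the label "unspecified" for strings that name no sex, and the choice to keep codes like "NW2 L" verbatim are all my assumptions, not anything defined by the dataset.

```python
import re
from collections import Counter

# Counts copied from the list above (strings occurring more than 10 times).
HORSE_TYPE_COUNTS = {
    "For Thoroughbred Three Year Old and Upward": 59,
    "For Thoroughbred Three Year Old and Upward Fillies and Mares": 27,
    "For Thoroughbred Three Year Old Fillies": 23,
    "For Thoroughbred Four Year Old and Upward": 23,
    "For Thoroughbred Two Year Old": 21,
    "For Thoroughbred Three Year Old": 20,
    "For Thoroughbred Two Year Old Fillies": 16,
    "For Thoroughbred Three Year Old and Upward (NW2 L)": 15,
    "For Thoroughbred Three Year Old and Upward Fillies and Mares (NW2 L)": 14,
    "For Thoroughbred Three Year Old and Upward Fillies and Mares (NW2 L X)": 12,
    "For Thoroughbred Four Year Old and Upward Fillies and Mares": 11,
}

AGE_WORDS = {"Two": 2, "Three": 3, "Four": 4, "Five": 5}


def parse_horse_type(text):
    """Split a HorseType string into age, sex and condition parts (hypothetical helper)."""
    # A trailing parenthesised code such as "(NW2 L)" is kept verbatim, not interpreted.
    cond = re.search(r"\(([^)]*)\)\s*$", text)
    condition = cond.group(1) if cond else None
    body = text[:cond.start()].strip() if cond else text.strip()

    age = re.search(r"(\w+) Year Old( and Upward)?", body)
    min_age = AGE_WORDS.get(age.group(1)) if age else None
    and_upward = bool(age and age.group(2))

    # Whatever follows the age phrase is treated as the sex restriction.
    rest = body[age.end():].strip() if age else ""
    return {
        "min_age": min_age,
        "and_upward": and_upward,
        "sex": rest or "unspecified",  # assumption: no sex named in the string
        "condition": condition,
    }


# Re-aggregate the listed counts by the parsed sex and age groups.
by_sex = Counter()
by_age = Counter()
for text, count in HORSE_TYPE_COUNTS.items():
    parsed = parse_horse_type(text)
    by_sex[parsed["sex"]] += count
    by_age[str(parsed["min_age"]) + ("+" if parsed["and_upward"] else "")] += count

print(by_sex)
print(by_age)
```

Keeping the original string alongside the parsed fields means nothing is lost if the grouping turns out to be wrong once the terminology is clearer.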
Claim Price:
As an incentive against running superior horses, horses can be bought at the "claim price" after each race (at least, that's my understanding). This keeps the market competitive-ish. I'm not sure whether the name is a relic of days past, and it warrants further investigation. Most races have a claim price under 10k, with a decent showing from 35-40k.
Race Length:
For conversion purposes, one mile is eight furlongs. Most races are significantly shorter than a mile. I'd guess that the shorter races have more variable outcomes: it's easy to see that the long-odds horses lose their pace in the longer races but compete well in the shorter runs.
Count  Race length
  111  Six Furlongs On The All Weather Track
   71  One Mile On The All Weather Track
   68  Five And One Half Furlongs On The All Weather Track
   47  One Mile On The Turf
   30  One And One Sixteenth Miles On The Turf
   17  One And One Sixteenth Miles On The All Weather Track
   12  Five Furlongs On The All Weather Track
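To compare lengths numerically, the spelled-out distances can be converted to furlongs using the eight-furlongs-per-mile rule. This is a minimal Python sketch that only handles the phrasings seen in the list above; the function name `length_in_furlongs` and the word tables are my own, and any other wording in the full dataset would raise an error rather than be guessed at.

```python
from fractions import Fraction

FURLONGS_PER_MILE = 8
NUMBER_WORDS = {"One": 1, "Two": 2, "Three": 3, "Four": 4,
                "Five": 5, "Six": 6, "Seven": 7}
FRACTION_WORDS = {"Half": Fraction(1, 2), "Quarter": Fraction(1, 4),
                  "Eighth": Fraction(1, 8), "Sixteenth": Fraction(1, 16)}

# Counts copied from the list above.
RACE_LENGTH_COUNTS = {
    "Six Furlongs On The All Weather Track": 111,
    "One Mile On The All Weather Track": 71,
    "Five And One Half Furlongs On The All Weather Track": 68,
    "One Mile On The Turf": 47,
    "One And One Sixteenth Miles On The Turf": 30,
    "One And One Sixteenth Miles On The All Weather Track": 17,
    "Five Furlongs On The All Weather Track": 12,
}


def length_in_furlongs(text):
    """Convert e.g. 'One And One Sixteenth Miles On The Turf' to 8.5."""
    distance, _, _surface = text.partition(" On The ")
    *amount, unit = distance.split()

    total = Fraction(NUMBER_WORDS[amount[0]])
    if len(amount) == 4 and amount[1] == "And":
        # "Five And One Half" -> 5 + 1 * 1/2
        total += NUMBER_WORDS[amount[2]] * FRACTION_WORDS[amount[3]]
    elif len(amount) != 1:
        raise ValueError(f"unrecognised distance: {distance!r}")

    if unit.startswith("Mile"):
        total *= FURLONGS_PER_MILE
    elif not unit.startswith("Furlong"):
        raise ValueError(f"unrecognised unit: {unit!r}")
    return float(total)


total_races = sum(RACE_LENGTH_COUNTS.values())
short_races = sum(count for text, count in RACE_LENGTH_COUNTS.items()
                  if length_in_furlongs(text) < FURLONGS_PER_MILE)
print(f"{short_races} of {total_races} listed races are shorter than a mile")
```

The surface ("All Weather Track" or "Turf") is split off and ignored here; it could be kept as a second column if it turns out to matter.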
Win, Place, Show Values:
A more relevant range of interest runs from 20k down, as the majority of races' win values lie in that range. For this price range, win amounts look linear, with a favorable slope for 3rd. As the price range goes up, the results skew nonlinear, with 3rd offering proportionally less value. Incentive for a conservative strategy, maybe? Worth noting even if I don't have the racing strategies down yet.
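One way to put a number on this is to express the 2nd and 3rd place values as shares of the win value and compare the shares below and above the 20k mark. The Python sketch below assumes the observation is about place values plotted against win values, which is my reading and may not match the original plot. The function names are hypothetical and the three example races are invented purely to show the shape of the calculation; they are not from the dataset.

```python
def payout_shares(value_1st, value_2nd, value_3rd):
    """Return the 2nd and 3rd place values as fractions of the win value."""
    if value_1st <= 0:
        raise ValueError("win value must be positive")
    return value_2nd / value_1st, value_3rd / value_1st


def mean_third_share(races, low, high):
    """Average 3rd-place share for races with low < win value <= high."""
    shares = [payout_shares(*race)[1] for race in races if low < race[0] <= high]
    return sum(shares) / len(shares) if shares else None


# Invented (1st, 2nd, 3rd) values for illustration only.
example_races = [
    (6_000, 2_000, 1_000),
    (12_000, 4_000, 2_000),
    (60_000, 20_000, 6_000),
]

print(mean_third_share(example_races, 0, 20_000))
print(mean_third_share(example_races, 20_000, float("inf")))
```

If the relationship really is linear through the origin in the lower range, the share should be roughly constant there, and a drop in the upper range would show up as a smaller average.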
Start:
90% of races are flagged as having "a good start for all". The horse in post position 1 (PP1) is the most frequent worst starter, at 2%, but this isn't really a fair comparison: the outside PP changes from race to race, so I would have to compare race group by race group, and I don't want to do that right now.
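For when I do get around to it, the group-by-group comparison could look roughly like the Python sketch below. It assumes that "race group" means races with the same number of starters, and that the Start field has already been turned into a worst-starting post position (or nothing, for a good start for all); neither of those is established above. The function name and the example rows are invented for illustration.

```python
from collections import Counter, defaultdict


def worst_start_rates(races):
    """Rate of being the worst starter, per post position, within each field size.

    races: iterable of (field_size, worst_pp), where worst_pp is None
    when the race had a good start for all.
    """
    totals = Counter()
    worst = defaultdict(Counter)
    for field_size, worst_pp in races:
        totals[field_size] += 1
        if worst_pp is not None:
            worst[field_size][worst_pp] += 1

    return {
        size: {pp: worst[size][pp] / n for pp in range(1, size + 1)}
        for size, n in totals.items()
    }


# Invented rows: (number of starters, worst-starting post position or None).
example_races = [(6, None), (6, 1), (6, 6), (6, None), (8, 8), (8, None)]

rates = worst_start_rates(example_races)
for size in sorted(rates):
    print(size, rates[size])
```

Within each field size the inside and outside posts are then comparable, since the outermost post is the same position for every race in the group.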
So that's a start. More to come.
The Bukowski Sequence:
Like any other 17-year-old guy, I found Bukowski a really fun read. Different era, different perspective. Slums, women, booze, fights: all make for potent youth catnip. It got me back into reading, so I have been grateful for that. More recently, and partially due to a location change, I've chased down another Bukowski fixture: the horses. As I like uncertainty and probabilistic estimates, I figured I'd throw my hat into the ring.
First step was to acquire the dataset. Copy-pastes and the following script helped out with that: