mildbyte's notes tagged 'scala'

(notes) / (tags) / (streams)

View: compact / full
Time: backward / forward
Source: internal / external / all
Public RSS / JSON

(show contents)

project Betfair, part 8

@mildbyte 5 years, 5 months ago | programming | python | betfair | scala |

Introduction

Previously on project Betfair, we gave up on market making and decided to move on to a simpler Betfair systematic trading idea that seemed to work.

Today on project Betfair, we'll trade it live and find out it doesn't really work.

DAFBot on greyhounds: live trading

With the timezone bug now fixed, I was ready to let my bot actually trade some greyhound races for a while. I started trading with the following parameters:

  • Favourite selection time: 180s
  • Entry order time: 160s
  • Exit time: 60s
  • Bet size: £5

I decided to pick the favourite closer to the trade since I thought that if it were picked 5 minutes before the race starting, it could change and there was no real point in locking it in so long before the actual trade. The 60s exit point was mostly to give me a margin of safety to liquidate the position manually if something happened as well as in case of the market going in-play before the advertised time (there's no in-play trading in greyhound markets, so I'd carry any exposure into the game. At that point, it would become gambling and not trading).

Results

So how did it do?

Well, badly for now. Over the course of 25 races, in 4 hours, it lost £5.93: an average of -£0.24 per race with a standard deviation of £0.42. That was kind of sad and slightly suspicious too: according to the fairly sizeable dataset I had backtested this strategy on, its return was, on average, £0.07 with a standard deviation of £0.68.

I took the empirical CDF of the backtested distribution of returns, and according to it, getting a return of less than -£5.93 with 25 samples had a probability of 1.2%. So something clearly was going wrong, either with my backtest or with the simulation.

I scraped the stream market data out of the bot's logs and ran the backtest just on those races. Interestingly, it predicted a return of -£2.70. What was going wrong? I also scraped the traded runners, the entry and the exit prices from the logs and from the simulation to compare. They didn't match! A few times the runner that the bot was trading was different, but fairly often the entry odds that the bot got were lower (recall that the bot wants to be long the market, so entering at lower odds (higher price/implied probability) is worse for it). Interestingly, there was almost no mismatch in the exit price: the bot would manage to close its position in one trade without issues.

After looking at the price charts for a few races, I couldn't understand what was going wrong. The price wasn't swinging wildly to fool the bot into placing orders at lower odds: in fact, the price 160s before the race start was just different from what the bot was requesting.

Turns out, it was yet another dumb mistake: the bot was starting to trade 150s before the race start and pick the favourite at that point as well. Simulating what the bot did indeed brought the backtest's PnL (on just those markets) within a few pence from the realised PnL.

So that was weird: moving the start time by 10 seconds doubled the loss on that dataset (by bringing it from -£2.70 to -£5.93).

Greyhound capacity analysis

There was another issue, though: the greyhound markets aren't that liquid.

While there is about £10000-£15000 worth of bets available to match against in an average greyhound race, this also includes silly bets (like offering to lay at 1000.0).

To demonstrate this better, I added market impact to the backtester: even assuming that the entry bet gets matched 160s before the race (which becomes more difficult to believe at higher bet amounts, given that the average total matched volume by that point is around £100), closing the bet might not be possible to do completely at one odds level: what if there isn't enough capacity available at that level and we have to place another lay bet at higher odds?

Here's some code that simulates that:

def get_long_return(lines, init_cash, startT, endT, suspend_time,
    market_impact=True, cross_spread=False):

    # lines is a list of tuples: (timestamp, available_to_back,
    #    available_to_lay, total_traded)
    # available to back/lay/traded are dictionaries
    #   of odds -> availablility at that level

    # Get start/end availabilities
    start = get_line_at(lines, suspend_time, startT)
    end = get_line_at(lines, suspend_time, endT)

    # Calculate the inventory 
    # If we cross the spread, use the best back odds, otherwise assume we get
    # executed at the best lay
    if cross_spread:
        exp = init_cash * max(start[1])
    else:
        exp = init_cash * min(start[2])

    # Simulate us trying to sell the contracts at the end
    final_cash = 0.

    for end_odds, end_avail in sorted(end[2].iteritems()):

        # How much inventory were we able to offload at this odds level?
        # If we don't simulate market impact, assume all of it.
        mexp = min(end_odds * end_avail, exp) if market_impact else exp
        exp -= mexp

        # If we have managed to sell all contracts, return the final PnL.
        final_cash += mexp / end_odds
        if exp < 1e-6:
            return final_cash - init_cash

    # If we got to here, we've managed to knock out all price levels 
    # in the book.
    return final_cash - init_cash

I then did several simulations of the strategy at different bet sizes.

Turns out, as we increase the bet size away from just £1, the PnL quickly decays (the vertical lines are the error bars, not the standard deviations). For example, at bet size of £20, the average return per race is just £0.30 with a standard deviation of about £3.00 and a standard error of £0.17.

DAFBot on horses

At that point, I had finally managed to update my new non-order-book simulator so that it could work on horse racing data, which was great, since horse markets were much more preferable to greyhound ones: they were more liquid and there was much less opportunity for a single actor to manipulate prices. Hence there would be more capacity for larger bet sizes.

In addition, given that the spreads in horses are much tighter, I wasn't worried about having a bias in my backtests (the greyhound one assumes we can get executed at the best lay, but most of its PnL could have come from the massive back-lay spread at 160s before the race, despite that I limited the markets in the backtest to those with spreads below 5 ticks).

Research

I backtested a similar strategy on horse data but, interestingly enough, it didn't work: the average return was well within the standard error from 0.

However, flipping the desired position (instead of going long the favourite, betting against it) resulted in a curve similar to the one for the greyhound strategy. In essence, it seemed as if there was an upwards drift in the odds on the favourite in the final minutes before the race. Interestingly, I can't reproduce those results with the current, much larger, dataset that I've gathered (even if I limit the data to what I had at that point), so the following results might be not as exciting.

The headline number, according to my notes at that time, was that, on average, with £20 lay bets, entering at 300 seconds before the race and exiting at 130 seconds, the return was £0.046 with a standard error of £0.030 and a standard deviation of £0.588. This seems like very little, but the £20 bet size would be just a start. In addition, there are about 100 races every day (UK + US), hence annualizing that would result in a mean of £1690 and a standard deviation of £112.

This looked nice (barring the unrealistic Sharpe ratio of 15), but the issue was that it didn't scale well: at bet sizes of £100, the annualized mean/standard deviation would be £5020 and £570, respectively, and it would get worse further on.

I also had found out that, at £100 bet sizes, limiting the markets to just those operating between 12pm and 7pm (essentially just the UK ones) gave better results, despite that the strategy would only be able to trade 30 races per day. The mean/standard deviation were £4220 and £310, respectively: a smaller average return and a considerably smaller standard deviation. This was because the US markets were generally thinner and the strategy would crash through several price levels in the book to liquidate its position.

Note this was also using bet sizes and not exposures: so to place a lay of £100 at, say, 4.0, I would have to risk £300. I didn't go into detailed modelling of how much money I would need deposited to be able to trade this for a few days, but in any case I wasn't ready to trade with stakes that large.

Placing orders below £2

One of the big issues with live trading the greyhound DAFBot was the fact that the bot can't place orders below £2. Even if it offers to buy (back), say, £10 at 2.0, only £2 of its offering could actually get matched. After that point, the odds could go to, say, 2.5, and the bot would now have to place a lay bet of £2 * 2.0 / 2.5 = £1.6 to close its position.

If it doesn't do that, it would have a 4-contract exposure to the runner that it paid £2 for (will get £4 if the runner wins for a total PnL of £2 or £0 if the runner doesn't win for a total PnL of -£2).

If it instead places a £2 lay on the runner, it will have a short exposure of 2 * 2.0 - 2 * 2.5 = -1 contract (in essence, it first has bought 4 contracts for £2 and now has sold 5 contracts for £2: if the runner wins, it will lose £1, and if the runner loses, it will win nothing). In any case, it can't completely close its position.

So that's suboptimal. Luckily, Betfair documents a loophole in the order placement mechanism that can be used to place orders below £2. They do say that it should only be used to close positions and not for normal trading (otherwise people would be trading with £0.01 amounts), but that's exactly our use case here.

The way it's done is:

  • Place an order above £2 that won't get matched (say, a back of 1000.0 or a lay of 1.01);
  • Partially cancel some of that order to bring its unmatched size to the desired amount;
  • Use the order replacement instruction to change to odds of that order to the desired odds.

Live trading

I started a week of fully automated live trading on 2nd October. That was before I implemented placing bets below £2 and the bot kind of got lucky on a few markets, unable to close its short exposure fully and the runner it was betting against losing in the end. That was nice, but not exactly intended. I also changed the bot to place bets based on a target exposure of 10 contracts (as opposed to stakes of £10, hence the bet size would be 10 / odds).

In total, the bot made £3.60 on that day after trading 35 races.

Things quickly went downhill after I implemented order placement below £2:

  • 3rd October: -£3.41 on 31 races (mostly due to it losing £3.92 on US markets after 6pm);
  • 4th October: increased bet size to 15 contracts; PnL -£2.59 on 22 races;
  • 5th October: made sure to exclude markets with large back-lay spreads from trading; -£2.20 on 25 races (largest loss -£1.37, largest win £1.10);
  • 6th October: -£3.57 on 36 races;
  • 7th October: -£1.10 on 23 races.

In total, the bot lost £9.27 over 172 races, which is about £0.054 per race. Looking at Betfair, the bot had made 395 bets (entry and exit, as well as additional exit bets at lower odds levels when there wasn't enough available at one level) with stakes totalling £1409.26. Of course, it wasn't risking more than £15 at any point, but turning over that much money without issues was still impressive.

What wasn't impressive was that it consistently lost money, contrary to the backtest.

Conclusion

At that point, I was slowly growing tired of Betfair. I played with some more ideas that I might write about later, but in total I had been at it for about 2.5 months and had another interesting project in mind. But project Betfair for now had to go on an indefinite hiatus.

To be continued...

Enjoyed this series? I do write about other things too, sometimes. Feel free to follow me on twitter.com/mildbyte or on this RSS feed! Alternatively, register on Kimonote to receive new posts in your e-mail inbox.

Interested in this blogging platform? It's called Kimonote, it focuses on minimalism, ease of navigation and control over what content a user follows. Try the demo here or the beta here and/or follow it on Twitter as well at twitter.com/kimonote!

project Betfair, part 7

@mildbyte 5 years, 5 months ago | programming | python | betfair | scala |

Introduction

Previously on project Betfair, we ran the production market-making CAFBot in Scala, got it working, made some money, lost some money and came back with some new ideas.

Today, we'll test those ideas, look at something that's much more promising and learn a dark truth about timezones.

Sorry this took a bit too long to write, by the way, I've been spending some time working on Kimonote to add email digests to streams. The idea is that given some exporters from streams (e-mail, RSS (already works), Facebook, Twitter etc) with varying frequencies (immediately or weekly/daily digests) as well as some importers (again RSS/Twitter/Facebook/other blogging platforms) a user could create their own custom newsletter and get it sent to their mailbox (push) instead of wasting time checking all those websites (pull), as well as notify their social media circles when they put a new post up anywhere else. None of this works yet, but other features do — if you're interested, sign up for the beta here!

Shameless plug over, back to the world of automated trading goodness.

CAFBotV2 with wider spreads

Remember how in the real-world greyhound market the bot managed to have some of its bets matched despite that they were several ticks away from the best back and lay? I realised I never really tested that in simulation: I started from making the market at the top of the order book and kind of assumed that further away from there matching would never happen. Looks like I was wrong (and in fact in part 5 the bot had issues with its bets that were moved away from the current market because of high exposure getting matched anyway).

So I added (backported?) the levelBase parameter from the Scala bot into the research one: recall that it specified how far from the best back/lay the bot should start working before applying all the other offsets (exposure or order book imbalance). Hence at levelBase = 0 the bot would work exactly as before and with levelBase = 1 it would start 1 Betfair tick away from the best back/lay. levelBase = 3 is what was traded live on the greyhound market.

The idea behind this is kind of simple: if the bot still gets its bets matched even if it's far away from the best back/lay, it will earn a higher spread with fewer trades.

So, first, I ran it on our favourite horse market with levelBase = 1.

Results

Horses, levelBase = 1

It didn't do well: there were very few matches and so most of the time it just did nothing, trading a grand total of 3 times. This meant that it got locked into an inventory that it couldn't offload.

Let's run it on the whole dataset: though this market didn't work as well, in some other ones matching did happen in jumps larger than 1 tick, so those might be able to offset more liquid markets.

We're tantalizingly close to finally having a PnL of zero (the more observant reader might notice that we could have done the same by not trading at all). Let's see how it would have done on the greyhound markets, which we do know sometimes jump like crazy.

Greyhounds, levelBase = 3

Not very inspiring either. There's a large amount of markets where this wouldn't have done anything at all (since the bets were so far from the best back/lay, they don't ever get hit), and when something does happen, it seems to be very rare, so the bot can't liquidate its position and remains at the mercy of the market.

So while that was a cute idea, it didn't seem to work.

DAFBot

At this point, I was running out of ideas. The issue of the bot getting locked into an inventory while the market was trending against it still remained, so I had to look at the larger-scale patterns in the data: perhaps based on the bigger moves in the market, the bot could have a desired inventory it could aim for (instead of always targeting zero inventory).

Consider this: if we think that the market is going to move one way or another, it's okay for the bot to have an exposure that way and it can be gently guided towards it (by means of where it sets its prices). Like that, the bot would kind of become a hybrid of a slower trading strategy with a market maker: even if its large-scale predictions of price movements weren't as good, they would get augmented by the market making component and vice versa.

Carry?

I tried out a very dumb idea. Remember how in most of the odds charts we looked at the odds, for some reason, trended down? I had kind of noticed that, or at least I thought I did, and wanted to quantify it.

Those odds were always on the favourite (as in, pick the greyhound/horse with the lowest odds 1 hour before the race begins and see how they change). The cause could be that, say, people who wanted to bet on the favourite would delay their decision to see if any unexpected news arrived before the race start, which is the only sort of news that could move the market.

Whatever the unexpected news would be, they would likely affect the favourite negatively: they could be good for any of the other greyhounds/horses, thus making it more likely for them to win the race. Hence it would make sense, if someone wanted to bet on the favourite, for them to wait until just before the race begins to avoid uncertainty, thus pushing the odds down as the race approaches.

So what if we took the other side of this trade? If we were to go long the favourite early, we would benefit from this downwards slide in odds, at the risk of some bad news coming out and us losing money. I guessed this would be similar to a carry trade in finance, where the trader makes money if the market conditions stay the same (say, borrowing money in a lower interest rate currency and then holding it in a higher interest rate currency, hoping the exchange rate doesn't move). In essence, we'd get paid for taking on the risk of unexpected news about the race coming out.

Research: greyhounds

I had first started doing this using my order book simulator, but realised it would be overkill: if the only thing I wanted to do was testing a strategy that traded literally twice (once to enter the position, once to close it), it would be better write a custom scraper from the stream data that would get the odds' timeseries and simulate the strategy faster.

At that point, I realised the horse racing stream data was too large to fit into memory with the new simulator. So I put that on hold for a second and tried my idea on greyhound markets.

This chart plots, at each point in time before the race, the average return on going long (backing) the favourite and then closing our position 15s before the race begins. In any case, the favourite is picked 5 minutes before the race begins. The vertical lines are the error bars (not standard deviations). Essentially, what we have here is a really consistent way to lose money.

This is obviously because of the back-lay spreads: the simulation here assumes we cross the spread both when entering and exiting, in essence taking the best back in the beginning and the best lay at the end.

Remember this chart from part 4?

The average spread 120s before a greyhound race begins is about 5 ticks. We had previously calculated that the loss on selling 1 tick lower than we bought is about 0.5%, so no wonder we're losing about 3% of our investment straight away.

What if we didn't have to cross the spread?

Woah. This graph assumes that instead of backing the runner at the best back, we manage to back it at the current best lay (by placing a passive back at those odds). When we're exiting the position just before the race begins, time is of the essence and so we're okay with paying the spread and placing a lay bet at the odds of the current best lay (getting executed instantly).

The only problem is actually getting matched: since matching in greyhound markets starts very late (as much money gets matched in the last 80 seconds as does before), our bet could just remain in the book forever, or get matched much closer to the beginning of the race.

But here's the fun part: this graph doesn't care. It shows that if the bet is matched at whatever the best lay was 160 seconds before the race, on average this trade makes money — even if the actual match happens a minute later. If the bet doesn't get matched at all, the trade simply doesn't happen.

This does assume that the performance of this strategy is independent of whether or not the bet gets hit at all, but if that's not the case, we would have been able to use the fact that our bet got hit as a canary: when it gets hit, we know that being long this market is a good/bad thing and adjust our position accordingly.

Implementation in Scala

With that reasoning, I went to work changing the internals of Azura to write another core for it and slightly alter the way it ran. The algorithm would be:

  • Run with parameters: market, favourite selection time, entry start time, entry end time, exit time (all in seconds before the race start), amount to bet.
  • Subscribe to the market/order streams, get the market suspend time from the Betfair API
  • At favourite selection time: inspect our local order book cache, select the runner with the lowest odds
  • At entry start time: place a passive back at the current best lay odds of the given amount on the favourite.
  • At entry end time: cancel remaining unexecuted orders.
  • At exit time: if we have a position (as in our entry order got hit), unwind it aggressively.

I called the new core DAFBot (D stands for Dumb and what AF stands for can be gleaned from the previous posts). I wanted to reuse the idea of polling a strategy for the orders that it wished to offer and the core being stateless, since that would mean that if the bot crashed, I could restart it and it would proceed where it left off. That did mean simple actions like "buy this" became more difficult to encode: the bot basically had to look at its inventory and then say "I want to maintain an outstanding back bet for (how much I want to buy - how much I have)".

Finally, yet another Daedric prince got added to my collection: Sheogorath, "The infamous Prince of Madness, whose motives are unknowable" (I had given up on my naming scheme making sense by this point), would schedule instances of Azura to be run during the trading day by using the Betfair API to fetch a list of greyhound races and executing Azura several minutes before that.

Live trading

I obviously wasn't ready to get Sheogorath to execute multiple instances of Azura and start losing money at computer speed quite yet, so for now I ran the new strategy manually on some races, first without placing any bets (just printing them out) and then actually doing so.

The biggest issue was the inability to place bets below £2. I had thought this wouldn't be a problem (as I was placing entry bets with larger amounts), but fairly often only part of the offer would get hit, so the bot would end up having an exposure that it wasn't able to close (since closing it would entail placing a lay bet below £2). Hence it took some of that exposure into the race, which wasn't good.

Timing bug?

In addition, when testing Sheogorath's scheduling (by getting it to kick off instances of Azura that didn't place bets), I noticed a weird thing: Sheogorath would start Azura one minute later than intended. For example, for a race that kicked off at 3pm, Azura was supposed to be started 5 minutes before that (2:55pm) whereas it was actually executed at 2:56pm.

While investigating this, I realised that there was another issue with my data: I had relied on the stream recorder using the market suspend times that were fetched from Betfair to stop recording, but that might not have been the case: if the greyhound race started before the scheduled suspend time, then the recording would stop abruptly, as opposed to at the official suspend time.

Any backtest that counted backwards from the final point in the stream would kind of have access to forward-looking information: knowing that the end of the data is the actual suspend time, not the advertised one.

Hence I had to recover the suspend times that the recorder saw and use those instead. I still had all of the logs that it used, so I could scrape the times from them. But here was another fun thing: spot-checking some suspend times against Betfair revealed that they sometimes also were 1 minute later than the ones on the website.

That meant the forward-looking information issue was a bigger one, since the recorder would have run for longer and have a bigger chance of being interrupted by a race start. It would also be a problem in horse markets: since those can be traded in-play, there could have been periods of in-play trading in my data that could have affected the market-making bot's backtests (in particular, everyone else in the market is affected by the multisecond bet delay which I wasn't simulating).

But more importantly, why were the suspend times different? Was it an issue on Betfair's side? Was something wrong with my code? It was probably the latter. After meditating on more logfiles, I realised that the suspend times seen by Azura were correct whereas the suspend times for Sheogorath for the same markets were 1 minute off. They were making the same request, albeit at different times (Sheogorath would do it when building up a trading schedule, Azura would do it when one of its instances would get started). The only difference was that the former was written in Python and the latter was written in Scala.

After some time of going through my code with a debugger and perusing documentation, I learned a fun fact about timezones.

A fun fact about timezones

I used this bit of code to make sure all times the bot was handing were in UTC:

def parse_dt(dt_str, tz=None):
    return dt.strptime(dt_str, '%Y-%m-%dT%H:%M:%S.%fZ').replace(tzinfo=pytz.timezone(tz) if tz else None)

m_end_dt = parse_dt(m['marketStartTime'], m['event']['timezone'])
m_end_dt = m_end_dt.astimezone(pytz.utc).replace(tzinfo=None)

However, timezones change. Since pytz.timezone doesn't know the time of the timezone its argument refers to, it looks at the earliest definition of the timezone, which in the case of Europe/London is back in mid-1800s. Was the timezone offset back then something reasonable, like an hour? Nope, it was 1 minute.

Here's a fun snippet of code so you can try this at home:

In[4]: 
from datetime import datetime as dt
import pytz
def parse_dt(dt_str, tz=None):
    return dt.strptime(dt_str, '%Y-%m-%dT%H:%M:%S.%fZ').replace(tzinfo=pytz.timezone(tz) if tz else None)
wtf = '2017-09-27T11:04:00.000Z'
parse_dt(wtf)
Out[4]: datetime.datetime(2017, 9, 27, 11, 4)
In[5]: parse_dt(wtf, 'Europe/London')
Out[5]: datetime.datetime(2017, 9, 27, 11, 4, tzinfo=[DstTzInfo 'Europe/London' LMT-1 day, 23:59:00 STD])
parse_dt(wtf, 'Europe/London').astimezone(pytz.utc)
Out[6]: datetime.datetime(2017, 9, 27, 11, 5, tzinfo=[UTC])

And here's an answer from the django-users mailing group on the right way to use timezones:

The right way to attach a pytz timezone to a naive Python datetime is to call tzobj.localize(dt). This gives pytz a chance to say "oh, your datetime is in 2015, so I'll use the offset for Europe/London that's in use in 2015, rather than the one that was in use in the mid-1800s"

Finally, here's some background on how this offset was calculated.

Recovery

Luckily, I knew which exact days in my data were affected by this bug and was able to recover the right suspend times. In fact, I've been lying to you this whole time and all of the plots in this blog series were produced after I had finished the project, with the full and the correct dataset. So the results, actually, weren't affected that much and now I have some more confidence in them.

Conclusion

Next time on project Betfair, we'll teach DAFBot to place orders below £2 and get it to do some real live trading, then moving on to horses.

As usual, posts in this series will be available at https://kimonote.com/@mildbyte:betfair or on this RSS feed. Alternatively, follow me on twitter.com/mildbyte.

Interested in this blogging platform? It's called Kimonote, it focuses on minimalism, ease of navigation and control over what content a user follows. Try the demo here or the beta here and/or follow it on Twitter as well at twitter.com/kimonote!

project Betfair, part 6

@mildbyte 5 years, 5 months ago | programming | betfair | scala |

Introduction

Previously on project Betfair, we spent some time tuning our market making bot and trying to make it make money in simulation. Today, we'll use coding and algorithms to make (and lose) us some money in real life.

Meet Azura

Scala is amazing. At the very least, it can be written like a better Java (with neat features like allowing multiple classes in one file) and then that can evolve into never having to worry about NullPointerExceptions and the ability to write blog posts about monads.

And strictly typed languages are fun! There's something very empowering about being able to easily refactor code and rename, extract, push and pull methods around, whilst being mostly confident that the compiler is going to stop you from doing stupid things. Unlike, say, Python, where you have to write (and maintain) a massive test suite to make sure every line of your code gets executed, here proper typing can replace a whole class of unit tests.

Hence (and because I kind of wanted to relearn Scala) I decided to write the live trading version of my Betfair market-making bot I called Azura in Scala.

This will mostly be a series of random tales, code snippets, war stories and postmortems describing its development.

Build/deploy process

The actual design of the bot (common between the research Python version and the production Scala version) is described here.

In terms of libraries, I didn't have to use many: the biggest one was probably the JSON parser from the Play! framework. I used SBT (Simple Build Tool ("never have I seen something so misnamed" — coworker, though I didn't have problems with it)) to manage dependencies and build the bot.

I didn't really have a clever deploy procedure: I would log onto the "production" box, pull the code and use

sbt assembly

to create a so-called uber-jar, a Java archive that has all of the project's dependencies packaged inside of it. So executing it with the Java Virtual Machine would be simply a matter of

java -Dazura.dry_run=false -jar target/scala-2.12/azura-assembly-0.1.jar arguments

Disadvantages: this takes up space and possibly duplicates libraries that already exist on the target machine. Advantages: we don't depend on the target machine having the right library versions or the Java classpath set up correctly: the machine only needs to have the right version of the JVM.

Parsing the stream messages

object MarketData {
  def parseMarketChangeMessage(message: JsValue): JsResult[MarketChangeMessage] = message.validate[MarketChangeMessage](marketChangeMessageReads)

    case class RunnerChange(runnerId: Int, backUpdates: Map[Odds, Int],
                            layUpdates: Map[Odds, Int],
                            tradedUpdates: Map[Odds, Int])

    case class MarketChange(marketId: String, runnerChanges: Seq[RunnerChange], isImage: Boolean)

    implicit val oddsReads: Reads[Odds] = JsPath.read[Double].map(Odds.apply)
    implicit val orderBookLineReads: Reads[Map[Odds, Int]] = JsPath.read[Seq[(Odds, Double)]].map(
      _.map { case (p, v) => (p, (v * 100D).toInt) }.toMap)
    implicit val runnerChangeReads: Reads[RunnerChange] = (
      (JsPath \ "id").read[Int] and
      (JsPath \ "atb").readNullable[Map[Odds, Int]].map(_.getOrElse(Map.empty)) and
      (JsPath \ "atl").readNullable[Map[Odds, Int]].map(_.getOrElse(Map.empty)) and
      (JsPath \ "trd").readNullable[Map[Odds, Int]].map(_.getOrElse(Map.empty))) (RunnerChange.apply _)
    implicit val marketChangeReads: Reads[MarketChange] = (
      (JsPath \ "id").read[String] and
      (JsPath \ "rc").read[Seq[RunnerChange]] and
      (JsPath \ "img").readNullable[Boolean].map(_.getOrElse(false))) (MarketChange.apply _)
    implicit val marketChangeMessageReads: Reads[MarketChangeMessage] = (
      (JsPath \ "pt").read[Long].map(new Timestamp(_)) and
      (JsPath \ "mc").readNullable[Seq[MarketChange]].map(_.getOrElse(Seq.empty))) (MarketChangeMessage.apply _)
    case class MarketChangeMessage(timestamp: Timestamp, marketChanges: Seq[MarketChange])
  }

This might take some time to decipher. The problem was converting a JSON-formatted order book message into a data structure used inside the bot to represent the order book — and there, of course, can be lots of edge cases. What if the message isn't valid JSON? What if the structure of the JSON isn't exactly what we expected it to be (e.g. a subentry is missing)? What if the type of something isn't what we expected it to be (instead of a UNIX timestamp we get an ISO-formatted datetime)?

The Play! JSON module kind of helps us solve this by providing a DSL that allows to specify the expected structure of the JSON object by combining smaller building blocks. For example, marketChangeReads shows how to parse a message containing changes to the whole market order book (MarketChangeMessage). We first need to read a string containing the market ID at "id", then a sequence of changes to each runner (RunnerChange) located at "rc" and then a Boolean value at "img" that says whether it's a full image (as in we need to flush our cache and replace it with this message) or not.

To read a RunnerChange (runnerChangeReads), we need to read an integer containing the runner's ID and the changes to its available to back, lay and traded odds. To read those changes (orderBookLineReads), we want to parse a sequence of tuples of odds and doubles, convert the doubles (represending pound amounts available to bet or traded) into integer penny values and finally turn that into a map.

Finally, parsing the odds is simply a matter of creating an Odds object from the double value representing the odds (which rounds the odds to the neearest ones allowed by Betfair).

Core of the Scala CAFBotV2

  override def getTargetMarket(orderBook: OrderBook, runnerOrders: RunnerOrders): (Map[Odds, Int], Map[Odds, Int]) = {
    val logger: Logger = Logger[CAFStrategyV2]

    val exposure = OrderBookUtils.getPlayerExposure(runnerOrders).net
    val exposureFraction = exposure / offeredExposure.toDouble
    logger.info(s"Net exposure $exposure, fraction of offer: $exposureFraction")

    if (orderBook.availableToBack.isEmpty || orderBook.availableToLay.isEmpty) {
      logger.warn("One half of the book is empty, doing nothing")
      return (Map.empty, Map.empty)
    }

    logger.info(s"Best back: ${orderBook.bestBack()}, Best lay: ${orderBook.bestLay()}")
    val imbalance = (orderBook.bestBack()._2 - orderBook.bestLay()._2) / (orderBook.bestBack()._2 + orderBook.bestLay()._2).toDouble
    logger.info(s"Order book imbalance: $imbalance")

    // Offer a back -- i.e. us laying
    val backs: Map[Odds, Int] = if (exposureFraction >= -2) {
      val start = Odds.toTick(orderBook.bestBack()._1) +
        (if (exposureFraction >= -exposureThreshold && imbalance >= -imbalanceThreshold) 0 else -1) - levelBase
      (start until (start - levels) by -1).map { p: Int => Odds.fromTick(p) -> (offeredExposure / Odds.fromTick(p).odds).toInt }.toMap
    } else Map.empty

    val lays: Map[Odds, Int] = if (exposureFraction <= 2) {
      val start = Odds.toTick(orderBook.bestLay()._1) +
        (if (exposureFraction <= exposureThreshold && imbalance <= imbalanceThreshold) 0 else 1) + levelBase
      (start until start + levels).map { p: Int => Odds.fromTick(p) -> (offeredExposure / Odds.fromTick(p).odds).toInt }.toMap
    } else Map.empty

    (backs, lays)
  }

The core is actually fairly simple and similar to the research version of CAFBotV2. Here, exposure really means inventory, as in amount of one-pound contracts the bot is holding. Instead of absolute inventory values for its limits (like 30 contracts in the previous post), the bot operates with fractions of amount of contracts it offers over its position (say, the previous limit of 30 contracts would have been represented here as 30/10 = 3).

After calculating the fraction and the order book imbalance, the bot calculates the back and lay bets it wishes to maintain in the book: first, the tick number it wishes to start on (the best back/lay or one level below/above if the order book imbalance or inventory is too negative/positive) and then each odds level counting down/up from that point. Finally, it divides the amount of contracts it wishes to offer by the actual odds in order to output the bet (in pounds) it wishes the be maintained at each price level.

There's also a custom levelBase parameter that allows to control how far from the best back/lay the market is made: with a levelBase of zero, the bot would place its bets at the best back/lay, with levelBase of one the bets would be 1 tick below/above best back/lay etc.

First messages on the stream and a dummy run

I sadly do not any more have the old logs back from when I managed to receive (and parse!) the first few messages from the Stream API, so here's a dramatic reconstruction.

20:46:58.093 [main] INFO main - Starting Azura...
20:46:58.096 [main] INFO main - Getting market suspend time...
20:46:58.648 [main] INFO main - {"filter":{"marketIds":["1.134156568"]},"marketProjection":["MARKET_START_TIME"],"maxResults":1}
20:46:59.793 [main] INFO main - [{"marketId":"1.134156568","marketName":"R9 6f Claim","marketStartTime":"2017-09-13T21:15:00.000Z","totalMatched":0.0}]
20:46:59.966 [main] INFO main - 2017-09-13T21:15
20:46:59.966 [main] INFO main - Initializing the subscription...
20:47:00.056 [main] INFO streaming.SubscriptionManager - {"op":"connection","connectionId":"100-130917204700-67655"}
20:47:00.077 [main] INFO streaming.SubscriptionManager - {"op":"status","id":1,"statusCode":"SUCCESS","connectionClosed":false}
20:47:00.081 [main] INFO main - Subscribing to the streams...
20:47:00.119 [main] INFO streaming.SubscriptionManager - {
  "op" : "status",
  "id" : 1,
  "statusCode" : "SUCCESS",
  "connectionClosed" : false
}
20:47:00.160 [main] INFO streaming.SubscriptionManager - {
  "op" : "status",
  "id" : 2,
  "statusCode" : "SUCCESS",
  "connectionClosed" : false
}
20:47:00.207 [main] INFO streaming.SubscriptionManager - {
  "op" : "mcm",
  "id" : 1,
  "initialClk" : "yxmUlZaPE8sZsOergBPLGYz/ipIT",
  "clk" : "AAAAAAAA",
  "conflateMs" : 0,
  "heartbeatMs" : 5000,
  "pt" : 1505335620115,
  "ct" : "SUB_IMAGE",
  "mc" : [ {
    "id" : "1.134156568",
    "rc" : [ {
      "atb" : [ [ 5.1, 2.5 ], [ 1.43, 250 ], [ 1.01, 728.58 ], [ 1.03, 460 ], [ 5, 20 ], [ 1.02, 500 ], [ 1.13, 400 ] ],
      "atl" : [ [ 24, 0.74 ], [ 25, 3.55 ], [ 26, 1.96 ], [ 29, 4.2 ], [ 900, 0.02 ] ],
      "trd" : [ [ 5.3, 3.96 ], [ 5.1, 6.03 ] ],
      "id" : 13531617
    }, ... ],
    "img" : true
  } ]
}
20:47:00.322 [main] INFO main - Initializing harness for runner 13531618
20:47:00.326 [main] INFO streaming.SubscriptionManager - {
  "op" : "ocm",
  "id" : 2,
  "initialClk" : "VeLnh9sEsAa1t/neBHeqxJ3aBG7YuYDbBNYGuqHD1AQ=",
  "clk" : "AAAAAAAAAAAAAA==",
  "conflateMs" : 0,
  "heartbeatMs" : 5000,
  "pt" : 1505335620130,
  "ct" : "SUB_IMAGE"
}
20:47:00.341 [main] INFO strategy.StrategyHarness - Exposures (GBX): Back 0, Lay 0, cash 0
20:47:00.342 [main] INFO strategy.StrategyHarness - Net exposure: 0
20:47:00.343 [main] INFO strategy.CAFStrategyV2 - Net exposure 0, fraction of offer: 0.0
20:47:00.343 [main] INFO strategy.CAFStrategyV2 - Best back: (Odds(3.45),198), Best lay: (Odds(6.6),472)
20:47:00.344 [main] INFO strategy.CAFStrategyV2 - Order book imbalance: -0.408955223880597
20:47:00.351 [main] INFO strategy.StrategyHarness - Strategy wants atb, atl (Map(Odds(3.3) -> 454, Odds(3.25) -> 461),Map(Odds(7.2) -> 208, Odds(7.4) -> 202)), current outstanding orders are back, lay (Map(),Map())
20:47:00.952 [main] INFO streaming.SubscriptionManager - {
  "op" : "mcm",
  "id" : 1,
  "clk" : "ADQAFQAI",
  "pt" : 1505335620944,
  "mc" : [ {
    "id" : "1.134156568",
    "rc" : [ {
      "atl" : [ [ 990, 0 ] ],
      "id" : 13531618
    } ]
  } ]
}
20:47:00.955 [main] INFO strategy.StrategyHarness - Exposures (GBX): Back 0, Lay 0, cash 0
20:47:00.955 [main] INFO strategy.StrategyHarness - Net exposure: 0
20:47:00.955 [main] INFO strategy.CAFStrategyV2 - Net exposure 0, fraction of offer: 0.0
20:47:00.955 [main] INFO strategy.CAFStrategyV2 - Best back: (Odds(3.45),198), Best lay: (Odds(6.6),472)
20:47:00.956 [main] INFO strategy.CAFStrategyV2 - Order book imbalance: -0.408955223880597
20:47:00.956 [main] INFO strategy.StrategyHarness - Strategy wants atb, atl (Map(Odds(3.3) -> 454, Odds(3.25) -> 461),Map(Odds(7.2) -> 208, Odds(7.4) -> 202)),...

Essentially, at the start the bot subscribes to the data stream for a given market and its own order status stream. At 20:47:00.207 it receives the first message from the market data stream: the initial order book image for all runners in that market.

Before subscribing to the streams, though, the bot also gets the market metadata to find out when the race actually begins. If it receives a message and it's timestamped less than 5 minutes away from the race start, it starts market making (well, pretending to, since at that point I hadn't implemented bet placement) on the favourite at that time by polling the strategy core and just returning the bets it wants to maintain on both sides of the order book.

Every time there's an update on the market book stream, the strategy is polled again to make sure it still wants the same bets to be maintained. If there's a change, then the harness performs cancellations/placements as needed.

Bet placement: order and book stream synchronization

The first problem I noticed when trying to implement bet placement was that the order book, order status streams and the API-NG aren't synchronized. In essence, if a bet is placed via API-NG, it takes a while for it to appear on the order status stream (showing whether or not it has been executed) or be reflected in the market order book. Since the core of the bot is supposed to be stateless, this could have caused this sort of an issue:

  • the core wants to offer £2 to back at 2.00
  • the data on the order status stream implies we don't have any outstanding bets
  • hence a bet of £2 at 2.00 is submitted via a REST call (that returns a successful response containing the bet ID and some other metadata
  • a new update to the order status stream arrives that doesn't yet contain the new bet
  • the core is polled again, it still wants to offer £2 at 2.00
  • since we haven't seen our bet on the status stream, we submit another bet
  • we now have 2 unmatched bets

This could get ugly quickly and there were several ways to solve it. One could be applying the bet that had just been submitted to a local copy of the order status stream and then ignoring the real Betfair message containing that bet when it does appear on the stream, but that would mean needing to reconcile our local cache with the Betfair stream: what if the bet doesn't appear or turns out to have been invalidated? I went with a simpler approach: get the bet ID returned by the REST API when a bet is placed and then stop polling the bot and placing bets until a bet with that ID is seen on the order status stream, since only then the bot would be in a consistent state.

The resultant timings are not HFT-like at all. From looking at the logs, a normal sequence is something like:

  • T: executor submits a bet (HTTP REST request)
  • T+120ms: receive a REST response with the bet ID
  • T+190ms: timestamp of the bet according to Betfair
  • T+200ms: bet appears on the order status stream (timestamped 10ms ago)

This could be partially mitigated by doing order submissions in a separate thread (in essence applying the order to our outstanding orders cache even before we receive a message from Betfair with its bet ID) but there would still be the issue of Betfair apparently taking about 190ms to put the bet into its order book. And I didn't want to bother for now, since I just wanted to get the bot into a shape where it could place live bets.

Live trading!

I was now ready to unleash the bot onto some real markets. I chose greyhound racing and, for safety, decided to do a test with making a market several ticks away from the current best lay/back, the reasoning being that while it would test the whole system end-to-end (in terms of maintaining outstanding orders), these bets would have little chance of getting matched and even if they did, it would be far enough from the current market for me to have time to quickly liquidate them manually and not lose money.

Attempt 0

Well, before that it had some very embarrassing false starts. Remember how I operated with penny amounts throughout the codebase to avoid floating point rounding errors? I completely forgot to change pennies to pounds again when submitting the bets, which meant that instead of a £2 bet I tried to submit a £200 bet. Luckily, I didn't have that much in my account so I just got lots of INSUFFICIENT_FUNDS errors.

Speaking of funding the account, Betfair doesn't do margin calls and requires all bets to be fully funded: so say for a lay of £2 at 1.5 you need to have at least (1.5 * £2) - £2 = £1 and for a back of £2 at 1.5 you need to have £2. Backs and lays can be offset against each other: so if we've backed a runner for £2 and now have £0 available to bet, we can still "green up" by placing a lay (as long as it doesn't make us short the runner).

For unmatched bets, it gets slightly more weird: if there's both an unmatched back and a lay in the order book, the amount available to bet is reduced by the maximum of the liabilities of the two: since either one of them can be matched, Betfair takes the worst-case scenario.

Attempt 1

So I started the bot in actual live mode, with real money (it would place 15 contracts on both sides, resulting in bets of about £5 at odds 3.0) and, well, the placement actually worked! On the Betfair website I saw several unmatched bets attributed to me which would change as the market moved.

20:47:01.059 [main] INFO strategy.StrategyHarness - Exposures (GBX): Back 0, Lay 0, cash 0
20:47:01.059 [main] INFO strategy.StrategyHarness - Net exposure: 0
20:47:01.059 [main] INFO strategy.CAFStrategyV2 - Net exposure 0, fraction of offer: 0.0
20:47:01.060 [main] INFO strategy.CAFStrategyV2 - Best back: (Odds(3.45),198), Best lay: (Odds(6.4),170)
20:47:01.060 [main] INFO strategy.CAFStrategyV2 - Order book imbalance: 0.07608695652173914
20:47:01.061 [main] INFO strategy.StrategyHarness - Strategy wants atb, atl (Map(Odds(3.3) -> 454, Odds(3.25) -> 461),Map(Odds(7.0) -> 214, Odds(7.2) -> 208)), current outstanding orders are back, lay (Map(Odds(7.4) -> 202, Odds(7.2) -> 208),Map(Odds(3.25) -> 461, Odds(3.3) -> 454))
20:47:01.068 [main] INFO execution.ExecutionUtils - Trying to submit 1 placements and 1 cancellations
20:47:01.069 [main] INFO execution.ExecutionUtils - Available to bet according to us: 97919
20:47:01.073 [main] INFO execution.ExecutionUtils - Submitting cancellations: {"marketId":"1.134156568","instructions":[{"betId":102464716465,"sizeReduction":null}]}
20:47:01.191 [main] INFO execution.ExecutionUtils - Result: 200
20:47:01.192 [main] INFO execution.ExecutionUtils - {
  "status" : "SUCCESS",
  "marketId" : "1.134156568",
  "instructionReports" : [ {
    "status" : "SUCCESS",
    "instruction" : {
      "betId" : "102464716465"
    },
    "sizeCancelled" : 2.02,
    "cancelledDate" : "2017-09-13T20:47:01.000Z"
  } ]
}
20:47:01.192 [main] INFO execution.ExecutionUtils - Submitting placements: {"marketId":"1.134156568","instructions":[{"orderType":"LIMIT","selectionId":13531618,"side":"BACK","limitOrder":{"size":2.14,"price":7,"persistenceType":"LAPSE"}}]}
20:47:01.308 [main] INFO execution.ExecutionUtils - Result: 200
20:47:01.309 [main] INFO execution.ExecutionUtils - {
  "status" : "SUCCESS",
  "marketId" : "1.134156568",
  "instructionReports" : [ {
    "status" : "SUCCESS",
    "instruction" : {
      "selectionId" : 13531618,
      "limitOrder" : {
        "size" : 2.14,
        "price" : 7,
        "persistenceType" : "LAPSE"
      },
      "orderType" : "LIMIT",
      "side" : "BACK"
    },
    "betId" : "102464717112",
    "placedDate" : "2017-09-13T20:47:01.000Z",
    "averagePriceMatched" : 0,
    "sizeMatched" : 0,
    "orderStatus" : "EXECUTABLE"
  } ]
}
20:47:01.310 [main] INFO streaming.SubscriptionManager - {
  "op" : "mcm",
  "id" : 1,
  "clk" : "AEUAIgAR",
  "pt" : 1505335621264,
  "mc" : [ {
    "id" : "1.134156568",
    "rc" : [ {
      "atl" : [ [ 7.4, 0 ] ],
      "id" : 13531618
    } ]
  } ]
}
20:47:01.311 [main] INFO main - Some bet IDs unseen by the order status cache, not doing anything...
20:47:01.313 [main] INFO streaming.SubscriptionManager - {
  "op" : "ocm",
  "id" : 2,
  "clk" : "ACEAFgAJABgACw==",
  "pt" : 1505335621269,
  "oc" : [ {
    "id" : "1.134156568",
    "orc" : [ {
      "id" : 13531618,
      "uo" : [ {
        "id" : "102464716465",
        "p" : 7.4,
        "s" : 2.02,
        "side" : "B",
        "status" : "EC",
        "pt" : "L",
        "ot" : "L",
        "pd" : 1505335620000,
        "sm" : 0,
        "sr" : 0,
        "sl" : 0,
        "sc" : 2.02,
        "sv" : 0,
        "rac" : "",
        "rc" : "REG_GGC",
        "rfo" : "",
        "rfs" : ""
      } ]
    } ]
  } ]
}
20:47:01.314 [main] INFO main - Some bet IDs unseen by the order status cache, not doing anything...
20:47:01.376 [main] INFO streaming.SubscriptionManager - {
  "op" : "mcm",
  "id" : 1,
  "clk" : "AEsAIwAR",
  "pt" : 1505335621369,
  "mc" : [ {
    "id" : "1.134156568",
    "rc" : [ {
      "atl" : [ [ 7, 2.14 ] ],
      "id" : 13531618
    } ]
  } ]
}
20:47:01.378 [main] INFO main - Some bet IDs unseen by the order status cache, not doing anything...
20:47:01.380 [main] INFO streaming.SubscriptionManager - {
  "op" : "ocm",
  "id" : 2,
  "clk" : "ACMAFwALABoADA==",
  "pt" : 1505335621372,
  "oc" : [ {
    "id" : "1.134156568",
    "orc" : [ {
      "id" : 13531618,
      "uo" : [ {
        "id" : "102464717112",
        "p" : 7,
        "s" : 2.14,
        "side" : "B",
        "status" : "E",
        "pt" : "L",
        "ot" : "L",
        "pd" : 1505335621000,
        "sm" : 0,
        "sr" : 2.14,
        "sl" : 0,
        "sc" : 0,
        "sv" : 0,
        "rac" : "",
        "rc" : "REG_GGC",
        "rfo" : "",
        "rfs" : ""
      } ]
    } ]
  } ]
}
20:47:01.382 [main] INFO strategy.StrategyHarness - Exposures (GBX): Back 0, Lay 0, cash 0
20:47:01.382 [main] INFO strategy.StrategyHarness - Net exposure: 0
20:47:01.382 [main] INFO strategy.CAFStrategyV2 - Net exposure 0, fraction of offer: 0.0

So as intended:

  • the market moved to 3.45 / 6.40 best back/lay
  • the bot had backs at 7.4 and 7.2 and lays at 3.25 and 3.3, but wanted backs at 7.2 and 7.0
  • so it cancelled the back at 7.4 and then placed one at 7.0...
  • ...and then waited until the cancelled bet and the new bet appeared on the order status stream (20:47:01.313 and 20:47:01.380, respectively) before proceeding.

And indeed, the bot managed to maintain its bets far enough from the action in order to not get matched.

I ran it again, on a different market:

20:54:19.217 [main] INFO strategy.CAFStrategyV2 - Best back: (Odds(3.15),3474), Best lay: (Odds(3.2),18713)
20:54:19.217 [main] INFO strategy.CAFStrategyV2 - Order book imbalance: -0.6868436471807815
20:54:19.218 [main] INFO strategy.StrategyHarness - Strategy wants atb, atl (Map(Odds(2.98) -> 503, Odds(2.96) -> 506),Map(Odds(3.35) -> 447, Odds(3.4) -> 441)), current outstanding orders are back, lay (Map(Odds(3.6) -> 416, Odds(3.65) -> 409),Map(Odds(3.25) -> 461, Odds(3.2) -> 468))
20:54:19.219 [main] INFO execution.ExecutionUtils - Trying to submit 4 placements and 4 cancellations
20:54:19.219 [main] INFO execution.ExecutionUtils - Available to bet according to us: 97934
20:54:19.220 [main] INFO execution.ExecutionUtils - Submitting cancellations: {"marketId":"1.134151928","instructions":[{"betId":102464926664,"sizeReduction":null},{"betId":102464926665,"sizeReduction":null},{"betId":102464926548,"sizeReduction":null},{"betId":102464926666,"sizeReduction":null}]}
20:54:19.324 [main] INFO execution.ExecutionUtils - Result: 200
20:54:19.324 [main] INFO execution.ExecutionUtils - {
  "status" : "PROCESSED_WITH_ERRORS",
  "errorCode" : "PROCESSED_WITH_ERRORS",
  "marketId" : "1.134151928",
  "instructionReports" : [ {
    "status" : "SUCCESS",
    "instruction" : {
      "betId" : "102464926664"
    },
    "sizeCancelled" : 4.16,
    "cancelledDate" : "2017-09-13T20:54:19.000Z"
  }, {
    "status" : "SUCCESS",
    "instruction" : {
      "betId" : "102464926665"
    },
    "sizeCancelled" : 4.1,
    "cancelledDate" : "2017-09-13T20:54:19.000Z"
  }, {
    "status" : "FAILURE",
    "errorCode" : "BET_TAKEN_OR_LAPSED",
    "instruction" : {
      "betId" : "102464926548"
    }
  }, {
    "status" : "FAILURE",
    "errorCode" : "BET_TAKEN_OR_LAPSED",
    "instruction" : {
      "betId" : "102464926666"
    }
  } ]
}
20:54:19.325 [main] ERROR execution.ExecutionUtils - Cancellation unsuccessful. Aborting.
20:54:19.325 [main] ERROR main - Execution failure.

Oh boy. I quickly alt-tabbed to the Betfair web interface, and yep, I had an outstanding bet that had been matched and wasn't closed. I managed to close my position manually (by submitting some offsetting bets on the other side) and in fact managed to make about £1 from the trade, but what on earth happened here?

Looking at the logs, it seems like the bot wanted to move its bets once again: the market suddenly dropped from best back/lay 3.4/3.45 down to 3.15/3.2, whereas the bot had bets at 3.2, 3.25/3.6, 3.65. So the bot needed to cancel all 4 of its outstanding bets and move them down as well.

But wait: how come the best back was at 3.15 and the bot had a lay bet (meaning that bet was available to back) at 3.2? Why wasn't an offer to back at higher odds (3.2 vs 3.15) at the top of the book instead?

In fact, those two bets had been matched and that hadn't yet been reflected on the order status feed. So when the bot tried to cancel all of its bets, two cancellations failed because the bets had already been matched. The odds soon recovered back and I managed to submit a back bet at higher odds than the bot had laid (which ended up in a profit), but the order status and the order book stream being desynchronized was a bigger problem.

I decided to just ignore errors where cancellations were failing: eventually the bot would receive an update saying that the bet got matched and would stop trying to cancel it.

Attempt 2

How did we get here?

10:08:52.623 [main] INFO strategy.StrategyHarness - Exposures (GBX): Back 4599, Lay 0, cash -1501
10:08:52.623 [main] INFO strategy.StrategyHarness - Net exposure: 4599
10:08:52.623 [main] INFO strategy.CAFStrategyV2 - Net exposure 4599, fraction of offer: 3.066
10:08:52.623 [main] INFO strategy.CAFStrategyV2 - Best back: (Odds(2.74),1000), Best lay: (Odds(2.8),1021)
10:08:52.623 [main] INFO strategy.CAFStrategyV2 - Order book imbalance: -0.010390895596239486
10:08:52.624 [main] INFO strategy.StrategyHarness - Strategy wants atb, atl (Map(Odds(2.68) -> 559, Odds(2.66) -> 563),Map(Odds(2.88) -> 520, Odds(2.9) -> 517)), current outstanding orders are back, lay (Map(),Map(Odds(2.66) -> 563, Odds(2.68) -> 559))
10:08:52.624 [main] INFO execution.ExecutionUtils - Trying to submit 2 placements and 0 cancellations
10:08:52.624 [main] INFO execution.ExecutionUtils - Available to bet according to us: 96626
10:08:52.624 [main] INFO execution.ExecutionUtils - Submitting placements: {"marketId":"1.134190324","instructions":[{"orderType":"LIMIT","selectionId":12743977,"side":"BACK","limitOrder":{"size":5.2,"price":2.88,"persistenceType":"LAPSE"}},{"orderType":"LIMIT","selectionId":12743977,"side":"BACK","limitOrder":{"size":5.17,"price":2.9,"persistenceType":"LAPSE"}}]}
10:08:52.769 [main] INFO execution.ExecutionUtils - Result: 200
10:08:52.769 [main] INFO execution.ExecutionUtils - {
  "status" : "SUCCESS",
  "marketId" : "1.134190324",
  "instructionReports" : [ {
    "status" : "SUCCESS",
    "instruction" : {
      "selectionId" : 12743977,
      "limitOrder" : {
        "size" : 5.2,
        "price" : 2.88,
        "persistenceType" : "LAPSE"
      },
      "orderType" : "LIMIT",
      "side" : "BACK"
    },
    "betId" : "102489084350",
    "placedDate" : "2017-09-14T10:08:52.000Z",
    "averagePriceMatched" : 2.9,
    "sizeMatched" : 5.2,
    "orderStatus" : "EXECUTION_COMPLETE"
  }, {
    "status" : "SUCCESS",
    "instruction" : {
      "selectionId" : 12743977,
      "limitOrder" : {
        "size" : 5.17,
        "price" : 2.9,
        "persistenceType" : "LAPSE"
      },
      "orderType" : "LIMIT",
      "side" : "BACK"
    },
    "betId" : "102489084351",
    "placedDate" : "2017-09-14T10:08:52.000Z",
    "averagePriceMatched" : 2.9,
    "sizeMatched" : 5.17,
    "orderStatus" : "EXECUTION_COMPLETE"
  } ]
}
...
10:08:52.789 [main] INFO strategy.StrategyHarness - Exposures (GBX): Back 7606, Lay 0, cash -2538
10:08:52.789 [main] INFO strategy.StrategyHarness - Net exposure: 7606
10:08:52.789 [main] INFO strategy.StrategyHarness - Max exposure reached, liquidating
10:08:52.789 [main] INFO strategy.StrategyHarness - Strategy wants atb, atl (Map(Odds(2.96) -> 7606),Map()), current outstanding orders are back, lay (Map(),Map(Odds(2.66) -> 563, Odds(2.68) -> 559))
10:08:52.790 [main] INFO execution.ExecutionUtils - Trying to submit 1 placements and 2 cancellations
10:08:52.790 [main] INFO execution.ExecutionUtils - Available to bet according to us: 95589
10:08:52.790 [main] INFO execution.ExecutionUtils - Submitting cancellations: {"marketId":"1.134190324","instructions":[{"betId":102489083069,"sizeReduction":null},{"betId":102489083236,"sizeReduction":null}]}
10:08:52.873 [main] INFO execution.ExecutionUtils - Result: 200
10:08:52.873 [main] INFO execution.ExecutionUtils - {
  "status" : "SUCCESS",
  "marketId" : "1.134190324",
  "instructionReports" : [ {
    "status" : "SUCCESS",
    "instruction" : {
      "betId" : "102489083069"
    },
    "sizeCancelled" : 5.63,
    "cancelledDate" : "2017-09-14T10:08:52.000Z"
  }, {
    "status" : "SUCCESS",
    "instruction" : {
      "betId" : "102489083236"
    },
    "sizeCancelled" : 5.59,
    "cancelledDate" : "2017-09-14T10:08:52.000Z"
  } ]
}
10:08:52.873 [main] INFO execution.ExecutionUtils - Submitting placements: {"marketId":"1.134190324","instructions":[{"orderType":"LIMIT","selectionId":12743977,"side":"LAY","limitOrder":{"size":76.06,"price":2.96,"persistenceType":"LAPSE"}}]}
10:08:52.895 [main] INFO execution.ExecutionUtils - Result: 200
10:08:52.895 [main] INFO execution.ExecutionUtils - {
  "status" : "FAILURE",
  "errorCode" : "INSUFFICIENT_FUNDS",
  "marketId" : "1.134190324",
  "instructionReports" : [ {
    "status" : "FAILURE",
    "errorCode" : "ERROR_IN_ORDER",
    "instruction" : {
      "selectionId" : 12743977,
      "limitOrder" : {
        "size" : 76.06,
        "price" : 2.96,
        "persistenceType" : "LAPSE"
      },
      "orderType" : "LIMIT",
      "side" : "LAY"
    }
  } ]
}
10:08:52.895 [main] ERROR execution.ExecutionUtils - Placement unsuccessful. Aborting.

This led to another scramble with me frantically trying to close the position. In the end, I managed to make about £6 of profit from that market (accidentally going short just before it kicked off with that runner losing in the end).

As you'll find out later, this will be the most money that project Betfair will make.

So what happened here? First, the bot's back bets were being matched disproportionately often: at 10:08:52.623 it held 45.99 contracts. It was still below its maximum exposure level, so it placed a couple more back bets far away from the market at 10:08:52.769. Those immediately got matched (see "orderStatus" : "EXECUTION_COMPLETE" in the REST response), bringing the bot's exposure to above 75 contracts, so at 10:08:52.789 it decided to completely liquidate its position.

What happened next was dumb: instead of placing a lay bet of £76.06 / 2.96 (number of contracts divided by the odds), it placed a bet of £76.06. This would have made the bot have a massive short on the runner, but it didn't have enough money to do so, so instead Betfair came back with an INSUFFICIENT_FUNDS error.

Interestingly enough, if the bot did manage to close its position, it would have made money (but less): the Back 7606, Lay 0, cash -2538 part basically says it had paid £25.38 for 76.06 contracts, implying average odds of 3.00, so with a lower lay of 2.96 it would get £25.69 back, leaving it with a profit of £0.31.

Attempts 3, 4, 5...

After fixing those bugs I encountered a few more: for example, since the bot couldn't place orders below £2, it would sometimes end up with a small residual exposure at the end of trading which it couldn't get rid of. There were other issues, for example, the bot wasn't incorporating instant matches (as in when the bet placement REST request came back saying the bet had gotten immediately matched). After a couple more runs I managed to accidentally lose most of the £7 I had accidentally made and decided to stop there for now.

What I was also interested in was the fact that despite my assumptions, offers as far as 3 Betfair ticks away from the market would get matched. This was partially because the bot was slower than when I simulated it and partially because matching in greyhound markets indeed happened in jumps, but that gave me a different idea: what if I did actually simulate making a market further away from the best bid/offer?

And later on, I would come across a different and simpler idea that would mean market making would get put on hold.

Conclusion

Next time on project Betfair, we'll tinker with our simulation some more and then start looking at our data from a different perspective.

As usual, posts in this series will be available at https://kimonote.com/@mildbyte:betfair or on this RSS feed. Alternatively, follow me on twitter.com/mildbyte.

Interested in this blogging platform? It's called Kimonote, it focuses on minimalism, ease of navigation and control over what content a user follows. Try the demo here and/or follow it on Twitter as well at twitter.com/kimonote!


Copyright © 2017–2018 Artjoms Iškovs (mildbyte.xyz). Questions? Comments? Suggestions? Contact support@kimonote.com.