Sunday, March 31, 2019

How I made my AlphaZero experiment 8x faster.

To learn AlphaZero, I decided to apply it to one of my favorite board games.

Along the way, I discovered many great examples that helped me learn about reinforcement learning.


Once my first implementation proved to be functional, I did a quick bit of math to predict how long training my neural network would take :

  • 1000 training iterations, each consisting of :
    • 100 learning games
    • 1 neural network training pass
    • 50 arena games to measure improvements
  • A game consists of : 
    • ~ 400 decisions per game
    • 1000 Monte Carlo simulations per decision
    • 1 neural network call per simulation
    • 1 millisecond per neural network call
  • Neural Network training is done over :
    • 200,000 training examples
    • Repeated over 10 epochs
    • 10 milliseconds per training example

Which adds up to : 2.54 years !
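
For the curious, here's that back-of-the-envelope calculation as a small Java sketch. The numbers are the ones listed above; only the wrapper class is mine.

// Rough estimate of the original training time, using the numbers above.
public class TrainingTimeEstimate {
    public static void main(String[] args) {
        long iterations = 1000;
        long gamesPerIteration = 100 + 50;          // learning games + arena games
        long decisionsPerGame = 400;
        long simulationsPerDecision = 1000;
        double msPerNetworkCall = 1.0;

        double selfPlayMs = iterations * gamesPerIteration * decisionsPerGame
                * simulationsPerDecision * msPerNetworkCall;

        long trainingExamples = 200_000;
        long epochs = 10;
        double msPerTrainingExample = 10.0;
        double trainingMs = iterations * trainingExamples * epochs * msPerTrainingExample;

        double years = (selfPlayMs + trainingMs) / 1000.0 / 3600 / 24 / 365;
        System.out.printf("Estimated training time: %.2f years%n", years); // ~2.5 years
    }
}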


Batching to the rescue

Training batch size

The first improvement I made was to increase the neural network training batch size. Going from a batch size of 1 to 5000 made training about 100 times faster for the same job. That's roughly 230 days saved.
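
For reference, here's roughly what that looks like in Deeplearning4j. This is a minimal sketch, not my exact training code: it assumes the examples were already gathered in a DataSet and that the network is a MultiLayerNetwork.

import org.deeplearning4j.datasets.iterator.impl.ListDataSetIterator;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.linalg.dataset.DataSet;
import org.nd4j.linalg.dataset.api.iterator.DataSetIterator;

public class BatchedTraining {
    // Train on all examples in large mini-batches instead of one example at a time.
    public static void train(MultiLayerNetwork model, DataSet trainingData, int epochs) {
        int batchSize = 5000;
        DataSetIterator iterator = new ListDataSetIterator<>(trainingData.asList(), batchSize);
        for (int epoch = 0; epoch < epochs; epoch++) {
            iterator.reset();
            model.fit(iterator); // one pass over the examples, 5000 per weight update
        }
    }
}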

Neural network call batching

With the training batch size having such a huge impact on performance, I wondered : could neural network calls be batched as well ?

A few experiments later, it turned out that batching neural network calls also yields good benefits.
  • Batch sizes of 100 are 20 times faster than individual calls
  • Batch sizes of 1000 are 40 times faster than individual calls
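
Here's a minimal sketch of the batched call itself, assuming the board positions are already encoded as float vectors and the network is a Deeplearning4j MultiLayerNetwork:

import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;

public class BatchedEvaluation {
    // Evaluate many board positions in a single network call instead of one call each.
    public static INDArray[] evaluate(MultiLayerNetwork model, float[][] encodedPositions) {
        INDArray batch = Nd4j.create(encodedPositions); // one row per position
        INDArray output = model.output(batch);          // a single forward pass
        INDArray[] results = new INDArray[encodedPositions.length];
        for (int i = 0; i < results.length; i++) {
            results[i] = output.getRow(i);              // split the batch back per game
        }
        return results;
    }
}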

The first problem : sequential algorithms

Monte Carlo simulations do not lend themselves very well to batching. Starting with a current state, they explore a decision tree until a new state is encountered. They then query the neural network to analyze the new state. This causes a problem as every call to the neural network depends on the result of previous calls.

The solution ? Run many games in parallel : each game advances until it needs a neural network call, and all the pending calls are then batched together.
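
The driver loop ends up looking something like the sketch below. The Game and Evaluator interfaces are illustrative placeholders, not my actual classes.

import java.util.ArrayList;
import java.util.List;

public class ParallelSelfPlay {

    // Illustrative placeholder for one self-play game that can pause at a network call.
    interface Game {
        boolean isFinished();
        float[] advanceToNextEvaluation();     // run MCTS until a position needs the network
        void applyEvaluation(float[] result);  // hand the network output back, then continue
    }

    // Illustrative placeholder for a batched network evaluator.
    interface Evaluator {
        float[][] evaluateBatch(float[][] positions);
    }

    // Advance every unfinished game to its next network call, batch the calls, repeat.
    public static void playAll(List<Game> games, Evaluator evaluator) {
        while (games.stream().anyMatch(g -> !g.isFinished())) {
            List<Game> waiting = new ArrayList<>();
            List<float[]> positions = new ArrayList<>();
            for (Game game : games) {
                if (!game.isFinished()) {
                    positions.add(game.advanceToNextEvaluation());
                    waiting.add(game);
                }
            }
            float[][] results = evaluator.evaluateBatch(positions.toArray(new float[0][]));
            for (int i = 0; i < waiting.size(); i++) {
                waiting.get(i).applyEvaluation(results[i]);
            }
        }
    }
}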

The second problem : recursive algorithms

The Monte Carlo algorithm I had was recursive, which didn't lend itself to pausing in the middle of its execution.

The solution : Refactor the algorithm from a recursive call to a loop.
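
Here's a sketch of what the iterative descent looks like; the Node interface is a placeholder for whatever the tree nodes actually hold:

import java.util.ArrayDeque;
import java.util.Deque;

public class IterativeDescent {

    // Illustrative placeholder for an MCTS node.
    interface Node {
        boolean isLeaf();
        Node selectBestChild();   // pick the child to explore (e.g. by UCT score)
        void update(float value); // update visit count and value statistics
    }

    // Walk down the tree with a loop instead of recursive calls, remembering the path
    // so the statistics can be updated once the leaf has been evaluated.
    public static Deque<Node> descend(Node root) {
        Deque<Node> path = new ArrayDeque<>();
        Node current = root;
        path.push(current);
        while (!current.isLeaf()) { // stop where the recursive version would recurse
            current = current.selectBestChild();
            path.push(current);
        }
        return path; // the top of the deque is the leaf waiting for evaluation
    }

    // Once the batched network call has produced a value for the leaf, back it up.
    public static void backPropagate(Deque<Node> path, float value) {
        while (!path.isEmpty()) {
            path.pop().update(value);
        }
    }
}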

The third problem : memory

Most Monte Carlo algorithms I saw keep a history of all the moves they encountered during a game, allowing them to effectively reuse states they've seen before. When running 100 games in parallel, each leading to about 400 decisions, each doing 1000 simulations, the program needs to hold 40 million game state statistics. At 1000 bytes per game state, that adds up to 40 gigabytes of required space.

The solution : Purge old data. While it's convenient to remember all the game states, older states are rarely reused in most games. This can be done by keeping a counter of the last time a game state was used. Once a game state gets too old, it gets purged.
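
A minimal sketch of that purge, with a placeholder statistics object and a plain map keyed by game state:

import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

public class GameStateCache {

    // Illustrative placeholder for the statistics kept per game state.
    static class StateStatistics {
        int visitCount;
        float totalValue;
        long lastUsedIteration; // stamped every time the state is looked up
    }

    private final Map<String, StateStatistics> statistics = new HashMap<>();
    private long currentIteration;

    // Look up (or create) the statistics for a state and mark it as recently used.
    public StateStatistics get(String stateKey) {
        StateStatistics stats = statistics.computeIfAbsent(stateKey, k -> new StateStatistics());
        stats.lastUsedIteration = currentIteration;
        return stats;
    }

    // Called periodically : drop any state that hasn't been touched recently.
    public void endIteration(long maxAge) {
        currentIteration++;
        Iterator<StateStatistics> it = statistics.values().iterator();
        while (it.hasNext()) {
            if (currentIteration - it.next().lastUsedIteration > maxAge) {
                it.remove();
            }
        }
    }
}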

The results : 8x faster

While running a single game is now significantly slower, running 100 games in parallel is 8x faster. The average time to run a neural network evaluation went from 800 microseconds to 80.

Both before and after the change, my 4 CPU cores were running at 95%.

Total execution time is now 58 days instead of 1.9 years. Not too bad for running the same job on the same hardware. And all the multi-threaded complexity is contained in Deeplearning4j.

Next steps

To speed things up further, there are a few things left to explore. So far, switching to CUDA cores on my GPU actually slowed things down, as I'm mostly running small-scale dense layers. CUDA cores work wonders for neural network convolutions that operate on a fixed grid pattern, but not all games fit that concept. In some board games, a single tile may have eight neighbors, another only one.

Running across multiple computers in parallel would definitely help as well.

If you have any other tips to speed things up, let me know in the comments!




Tuesday, December 31, 2013

5 Levels of REST API Response Statefulness

Stateless : A Best Practice That Isn't Absolute


One of the best practices to follow when designing RESTful APIs is to make the requests stateless.

In simple terms, it means that an API request should contain all the parameters it needs for the server to execute it without depending on the result of a previous request.

In theory, stateless requests bring the following benefits:

  • Scaling: Any server in a cluster can handle the request.
  • Simplified Caching: It's easier to cache the result of a request by using the explicit request state as a cache key.
  • Simplified Troubleshooting: When parts of the request state aren't explicitly stated, additional investigation is required to figure them out.
  • Reduced Learning Curve: Each API request is self-contained. To call the required API, developers don't need to learn or use multiple APIs in sequence.

Following this best practice can be quite challenging since the software requirements sometimes conflict with it. For instance, any API that requires authentication usually relies on one API request to authenticate the user in order to supply valid credentials to subsequent requests.

Since this best practice isn't entirely "black" or "white", I found that defining different levels of statefulness helps when I need to develop new APIs or explain the concept.

Statefulness Levels


  1. Permanent Data
  2. Changing Data
  3. User Specific Data
  4. Session Specific Data
  5. Random Behavior


Level 1: Permanent Data

The request passes explicit state data, and the response will always be the same for a given state.

  • Permanent stateless resources: A data request against a read-only data warehouse that stores historical information will always return the same response.
  • Permanent resources with a state: A resource may be delivered in different languages; the language selection is explicit in the request, so each state combination yields its own stable result.
    Example:
    • http://server.com/product/1234?language=french
    • http://server.com/product/1234?language=english


Caching those responses is trivial.

Level 2: Changing Data

The request passes explicit state data, but the response may change over time, even if the state combination is the same.

Examples:

  • Files: A picture stored at http://server.com/picture.bmp might change over time and might eventually be taken offline.
  • Dynamic Data: A business object stored at http://server.com/product/1234 might change over time and might eventually be deleted.


Caching those responses is relatively easy. The challenge lies in defining caching rules to ensure that entries get invalidated in a reasonable time frame.

Level 3: User Specific Data

The response may look different based on the user performing the request.

  • Access Control: Different users might get a different result for the request.
    Example: Given http://server.com/product/1234, if some users don't have access to it, they will receive a "403 Forbidden" response, while users with access will receive the information on the object.


  • Hidden Data: The response data is based on information hidden from the user.
    Example: The API http://server.com/publicity is used to retrieve the advertisement you see on a page. The content of that advertisement usually depends on information associated with your user account, and that information isn't available to the client performing the request.

  • Personal URL: A URL for which the response is bound to be different between users.
    Example: The URL http://server.com/my-cart. The response data will change based on the user performing the request.

    Note: this can be avoided if the URL explicitly states the cart identifier, such as : http://server.com/cart/5678. This way, API developers can deal with more than one cart. Users might also be able to share their cart with other users.
  • User Preferences : An API that leverages user preferences stored on the server.
    Example: Users store their preferred language on the server. When different users perform the request http://server.com/product/1234 they each get a different result based on their preferred language.

    Note: This can be avoided by tracking the state on the client side. The client application can read the user preferences and then use them explicitly in every request (see the sketch after this list):
    • Load the current user preferences at http://server.com/user/4567/preferences, which returns language=french
    • Reuse the loaded preference in the request http://server.com/product/1234?language=french
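
The two-step pattern, sketched with Java's built-in HttpClient. The URLs are the illustrative ones from the example above, and the parsing is deliberately naive:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ClientSidePreferences {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();

        // 1. Load the user preferences once.
        HttpRequest preferencesRequest = HttpRequest.newBuilder()
                .uri(URI.create("http://server.com/user/4567/preferences"))
                .build();
        String preferences = client.send(preferencesRequest, HttpResponse.BodyHandlers.ofString()).body();
        String language = preferences.contains("french") ? "french" : "english"; // naive parsing for the sketch

        // 2. Make every following request explicit by passing the preference as a parameter.
        HttpRequest productRequest = HttpRequest.newBuilder()
                .uri(URI.create("http://server.com/product/1234?language=" + language))
                .build();
        String product = client.send(productRequest, HttpResponse.BodyHandlers.ofString()).body();
        System.out.println(product);
    }
}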


Caching those responses is more difficult. On the server side, extra steps are required to determine the correct cache key to use. Edge caching can be impossible since the edge servers won't have all the data they need to return the right response.



Level 4: Session Specific Data

The response data may look different based on the user session performing the request.

  • Session Preferences: The server stores user preferences based on their session.
    Example: A user logs in twice to the same server using two different applications. They configure one application to French and the other to English. The user then performs the request http://server.com/product/1234 in both applications and gets a different result.

    Note: this can be avoided by making the client application track the selected language state.


Caching those responses is as difficult as those of level 3.


Level 5: Unpredictable/Random Behavior

The response may look different, even if executed with the same parameters with the same user in the same session.

  • Random Behavior: The behavior isn't predictable.
    Example: An API that returns a random number.




Lower is better

With those levels, the goal is to reach the lowest possible level to gain the benefits of statelessness while satisfying the application requirements.

I hope you'll find them as useful as I did.


Sunday, September 23, 2012

The missing 'S' in MVC

MVC (or as I prefer : MCV ) proved to be a very useful design pattern for compiled applications. Using this pattern, a developer could create a tool to visualize how the application is working internally, with a dedicated part presenting the current state of each layer.

Model Controller Views

What about dynamic schemas ?

A problem with this pattern starts to emerge once application logic can be changed while the application is running.

Suddenly, three layers aren't enough. A schema layer needs to be inserted somewhere in the pattern if we want to visualize parts such as the model definition or the rules defined in a controller. The role of the schema layer is to contain the definition of object structures and the logic that governs them. In other words: the class and method definitions.

Adding this layer at the start of the pattern would lead to SMCV.

Schema Model Controllers Views


While functional, this layer representation has a few dependency problems : it implies that the layers above the schema layer can depend on all the content in the schema. For instance, the model could depend on controller or view definitions.

The solution to all software problems : Add more layers !

To clean up dependency cycles, a better solution is to split the schema into dedicated layers for the model, the controllers and the views.

The only problem left is the terrible acronym for the pattern : SMSCSV.

Schema Model Schema Controllers Schema Views

These are the standard layers that ended up forming my typical applications on the Deduced Framework.

Typical layers of a Deduced Application

This view allows multiple users or developers to collaborate on a running application, monitoring and applying changes on any layer. It demonstrates another interesting side effect that dynamic schemas have on traditional software design patterns.

Thursday, July 12, 2012

RESTful MVC with a twist


The MCV layers

Smelly MVC

There are multiple approaches to the MVC design pattern, but few of them ever felt right to me.

One of my main gripes with the pattern described on wikipedia is the dependency cycle between the Controller layer and the View layer. Specifically:

  • A controller can send commands to its associated view.
  • A view must communicate to the controller to activate commands.

Dependency cycles are usually a design smell. While something seems wrong, the answer isn't always obvious.

Another problem arises on the controller layer where the code is usually a mix of commands used to manipulate the model and commands to manipulate the view. I can't help but think that those commands don't belong in the same layer.

The V and C switcharoo

The solution I apply to my designs is to switch the View and Controller layer order, leading to Model Controllers Views or the MCV acronym.

The extra 's's aren't there by mistake, but I'll get to that in a moment.

To make it work, I apply a few extra restrictions.

The Controller layer contains no "View" concepts or commands. It is now responsible for modifying the model consistently and for applying security to any changes being made.

The View layers become a bit thicker. They now contain the logic required to build the visual components and interact with their associated control layer.

A new twist comes from the Controller layers, which now generate change events consumed by the View layers. This is necessary for model changes that might trigger controller changes. It also adds the benefit that a developer could modify a control layer while the application is running, and the view would keep reacting appropriately.


A RESTful of shmuck


The constraints imposed by REST usually make most developers queasy. How the hell can we make APIs without context ? How can we accomplish every possible change using only PUT, POST and DELETE?

When it comes to monitoring the controller layer, the context restriction hits straight in the teeth. Most controllers need their context to know which user is executing a command. So how do you offer a REST API to monitor a control layer?

The answer : create a control layer instance for each context.

A sample MCV deployment with multiple instances.
Each user gets their own instance of a control layer, just like every view in the system gets its own view layer instance.

This means a RESTful API can be built to fetch each control layer independently. An administrator or a developer could query the control layer of a specific user to ensure that everything is behaving as expected.

Yo dawg, I heard you like MCV...

...so we put a MCV layer in your MCV layer.

If we have a RESTful API on top of our model, our controllers and our views, what stops us from wrapping all of those things together as a single model? Slap development control layers and development view layers on top of it all and use them to monitor, modify and debug all the layers of a running application?

Besides laziness, nothing makes this impossible.

An application MCV wrapped in a development MCV

I applied this in my pet project : the Deduced Framework. It turned out to be quite convenient, allowing the framework to offer a development environment where everything can be changed while the application is running. Whether it's a view, a controller or a model, everything is laid out in plain view. Developing an entire application without ever stopping it to compile is quite a pleasant feeling.

It feels like flirting with the future of development tools, where the idea of stopping an application to compile will be as obsolete as manually managing memory allocation or keeping track of the registers on a CPU.

- Steve McDuff


Friday, July 6, 2012

How should object oriented software work ?

The Lego Motorized Excavator

Building Blocks

The first time I learned the concept of object orientation, I had an epiphany: I envisioned building software the same way I assembled Lego blocks as a kid. Data and logic blocks would interconnect seamlessly to form applications quickly and efficiently. I would write a piece of code once and be able to reuse it in almost any context. Gone would be the days of duplicated code.

Back to Reality

Reality turned out to be a tad more complex.

C++ allowed multiple inheritance in multiple ways using optional virtual inheritance. This led to the creation of new problems and their corresponding solutions.


To simplify things, some languages enforced limitations on how object inheritance would work. Java enforced a single inheritance hierarchy, using interfaces to expose multiple functional aspects.

Data protection concepts emerged to prevent the outside world from modifying internal data. Getters and setters became the best practice to encapsulate data.

Extending existing objects ranges from very easy to impossible depending on whether or not the author of an object anticipated the need to override a specific aspect of his design.

My Ideal World

In an ideal world, I'd like to be able to assemble a class definition by pulling multiple aspects together.

Small data aspects to identify object fields that are common.

Small logic aspects that would pull data together in a reusable fashion.

Experiments

When I started experimenting with the Deduced Framework, one of my goals was to make it easy to create data structures. Multiple inheritance was a key piece of functionality that allowed users to define those structures and assemble them.

When I added deduction rules and actions to the schema, it soon became apparent that logic could also be encapsulated in small aspects, provided that a few simple rules were followed.


Object inheritance had to occur in a predictable fashion.
For instance, if a type inherited from 3 other types, the order in which each override applies had to be well defined.


Object inheritance had to remain simple.
In C++ terms, all inheritance would be virtual. The idea of non-virtual inheritance where you might get more than one instance of a specific field usually adds more confusion than value.


Data and logic had to be encapsulated in small units.
If too many pieces of data or logic are contained in a single type, reusing parts of this type becomes an all-or-nothing deal. You either get all the data and logic or you get none. The idea of separating data from logic in different types also brings value, as not all users of a type will want to reuse the same logic.

Applying those principles proved quite handy while defining the data structure used to represent parts of a user interface.

Defining the Data


For instance, when defining the concept of a visual tree, I defined the "Tree Item" type using four data aspects:

  • Named Collection : To apply a name to my tree item.
  • Component Container : To store the visual component used in the tree item.
  • Styled Collection : To apply a visual style to the tree item.
  • Tree Item List Container : To define the list of child tree nodes.

The definition of the "Tree" type reuses 3 of those 4 aspects : "Named Collection", "Styled Collection" and "Tree Item List Container". Only the "Component Container" wasn't applicable.

To maximize reusability, none of those types contained any logic.
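
To make that composition concrete, here's a rough sketch. Java only allows single class inheritance, so this approximates the aspects with interfaces; in the Deduced Framework the aspects also carry their data definitions.

import java.util.List;

// Data aspects, each describing one small, reusable piece of structure.
interface NamedCollection {
    String getName();
    void setName(String name);
}

interface ComponentContainer {
    Object getComponent();
    void setComponent(Object component);
}

interface StyledCollection {
    String getStyle();
    void setStyle(String style);
}

interface TreeItemListContainer {
    List<TreeItem> getChildren();
}

// "Tree Item" assembles all four aspects...
interface TreeItem extends NamedCollection, ComponentContainer,
        StyledCollection, TreeItemListContainer {
}

// ...while "Tree" reuses only three of them.
interface Tree extends NamedCollection, StyledCollection, TreeItemListContainer {
}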

Defining the Logic


One of my first requirements in defining the logic of my "Tree Item" was to build it based on a configuration. The type "Configured Collection" indicates that aspects of the current object are derived from a configuration. I therefore defined 4 new logic types to configure each data aspect.

  • Configured Name : To apply a name to my tree item based on a configuration.
  • Configured Component Container : To build the component within the tree node based on a configuration.
  • Configured Style : To apply a visual style to the tree item based on a configuration.
  • Configured Tree Item List Container : To create the list of child tree nodes based on a configuration.


Resulting object hierarchy


Benefits

One might argue that a hierarchy composed of 11 different types is complex. However, I believe that complexity in software is usually driven by what we want to accomplish with it. I'd rather accomplish a complex task by aggregating 11 simple concepts together than by creating a few types where the same 11 concepts are regrouped together.

Reusability is also greatly enhanced. When I need to create a new "Tree Item" type where only some of the fields are driven by a configuration, I can compose a new type by inheriting only the configuration aspects I need.

Best of all, I feel like I'm building software by linking blocks together once again.

Thursday, April 19, 2012

How Deduction Rules reduced my code size by 90%

Less code to write, less code to test


I don't think anyone needs to be convinced that writing less code to reach similar levels of functionality is beneficial.

Fewer lines of code usually result in fewer tests to run and faster development.

In this post, I'll explain how much impact the Deduced Framework had on my code size.


The Observer Pattern


I've always been a fan of the observer pattern. It consists of model objects that notify listeners when they change.

For instance, if a model is displayed in multiple widgets in a user interface, then each widget would listen to the model and update itself when a change happens.

Besides ensuring that the application remains consistent, it also encourages good software layering practices like Model View Controller (MVC) where a view could listen to a model without causing the model to depend on the view's implementation.


Simple Use Case : Adding Two Values


Another usage of the observer pattern is to manipulate data based on changing inputs.

For instance, if an application has a model with the following 3 fields:
  • Money in my Wallet
  • Money in my Bank Account
  • Total Money Available
We could then leverage the observer pattern to listen to the wallet and bank account fields and recalculate the total whenever they change.


// extend AbstractObservable to manage listeners and notify them.
// implement Listener to react to property change events.
public class MyMoney extends AbstractObservable implements Listener {
    private int wallet;
    private int bank;
    private int total;

    public MyMoney() { // constructor
        super();           // initialize AbstractObservable
        addListener(this); // listen to changes on self
    }

    public void propertyChanged(ChangeEvent e) {
        // verify if a field we care about changed
        if (e.getField().equals("wallet") ||
            e.getField().equals("bank")) {
            // detected a change that has an impact on the
            // total field, recalculate it
            calculateTotal();
        }
    }

    // update the total money available field
    public void calculateTotal() {
        total = bank + wallet;
    }
    ...
}


Things work well for simple scenarios, but they get complicated when required inputs need to be fetched throughout an object tree.

The code then needs to manage listeners across a network of objects and react to every change that occurs.

For instance, if the model wants to listen to the money value stored in a list of bank accounts, it would need to:

  • Add a listener to the list when it's set.
  • Remove the listener on the old list if it gets changed.
  • Add a listener to objects added to the list
  • Remove listeners on objects removed from the list


Spreadsheets do this without breaking a sweat...


Now compare the same feature implementation on a spreadsheet.


The only formula required here is :

B3=B2+B1

In one line, the formula defines its inputs, listens to them and executes itself when required.

Compared to the ~7 lines of code written above, it's easy to see how much time and code is saved.

Yet, for all their benefits, spreadsheets have their drawbacks. Every time you need a new instance of a formula, it needs to be copied. It's an obvious violation of the Don't Repeat Yourself (DRY) principle. If it turns out there is a defect in the formula, the fix needs to be applied everywhere it was copied.


Deduction Rules


The Deduced Framework attempts to bring the efficiency of spreadsheet formulas into the realm of object oriented development using Deduction Rules.

They consist of the following parts:

  • a list of inputs
  • a single output
  • the Java code to execute

In the example above, a deduction rule could have the following values:

  • walletInput : a link to the wallet property
  • bankInput : a link to the bank property
  • totalOutput : a link to the total property
  • Code : return walletInput + bankInput;


Once defined, the rule will automatically listen to its inputs, execute the code when they change and update the output value.

We are now back to 1 line of code. On par with spreadsheets. Yet we still have the benefits of object oriented programming where we can create multiple instances of the object without creating multiple copies of the rule definition.
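
To illustrate the concept, here's a plain-Java sketch mimicking the three parts of a rule. This is not the Deduced Framework's API; the Property and rule helpers are purely illustrative.

import java.util.List;
import java.util.function.Supplier;

class DeductionRuleSketch {

    // Illustrative observable property that notifies a callback when it changes.
    static class Property<T> {
        private T value;
        private Runnable listener = () -> { };
        T get() { return value; }
        void set(T newValue) { value = newValue; listener.run(); }
        void onChange(Runnable r) { listener = r; }
    }

    // Binds the output to the code and re-executes it whenever any input changes.
    static <T> void rule(List<? extends Property<?>> inputs, Property<T> output, Supplier<T> code) {
        Runnable execute = () -> output.set(code.get());
        for (Property<?> input : inputs) {
            input.onChange(execute);
        }
        execute.run(); // compute the initial value
    }

    public static void main(String[] args) {
        Property<Integer> wallet = new Property<>();
        Property<Integer> bank = new Property<>();
        Property<Integer> total = new Property<>();
        wallet.set(0);
        bank.set(0);

        // The one-line equivalent of the observer-pattern class above.
        rule(List.of(wallet, bank), total, () -> wallet.get() + bank.get());

        wallet.set(20);
        bank.set(100);
        System.out.println(total.get()); // 120
    }
}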


A Bigger Example : Building a Tree


The first time I implemented a visualization tree, I did it with the observer pattern. The code maintained a network of listeners across the model it wanted to represent. It reacted to all possible events that might have an impact on the view. It took about 2000 lines of code to accomplish its function.

Once deduction rules were introduced, I used them to create a new implementation of the visualization tree. Rules were used to create child tree nodes, set their display text and set their icon. It took only 50 lines of code to accomplish the same function.

That's 40 times less code !

The number of defects I found in the code was proportional to the number of lines. The code is also much easier to learn and extend.

If you're curious to learn more about deduction rules, I encourage you to read on the Deduced Framework, post a comment below or send me an e-mail at d-duffATusersDOTsourceforgeDOTnet.






Sunday, April 15, 2012

Deduced Framework 2.1 Released

I just released version 2.1.0.0 of the Deduced Framework.


With it, almost anyone can quickly build data manipulation applications and share them on the web.