How do you optimize Fat Fritz? It is now the number one engine!

by Albert Silver
25 November 2019 – On November 23rd, the new ranking of the world's best chess engines (the Computer Chess Rating List) was published. Fat Fritz is the new number one! The engine is now available to everyone and is included in our brand-new chess program, Fritz 17. Many users have asked us whether they should change anything in their program's configuration and, if so, what, in order to get maximum performance from the engine with a standard graphics card, and how this changes on a system with two powerful latest-generation graphics cards. Albert Silver explains how in his article.

Fritz 17 includes the FAT FRITZ engine

Fritz 17 is the new edition of the same Fritz chess program that has fascinated the chess world for some 25 years (!): victories against Garry Kasparov and Vladimir Kramnik; innovative and modern training methods for amateur and professional players; online chess on the Fritz server, and more. Fritz is "Germany's most popular chess program" (Der Spiegel) and offers everything a chess player needs. The most spectacular novelty: Fritz 17 includes "Fat Fritz", an engine based on an artificial-intelligence neural network.


A new paradigm

CCRL, the Computer Chess Ratings List, is one of the oldest and longest-running active chess engine rating lists, going strong for 15 years now. You can find full tests of engines going back to Chess Tiger 2004! Keeping up with the times, it now also includes tests run with graphics cards in order to rate neural network engines: not just Leela and Fat Fritz, but also Allie and Stoofvlees. But it is Fat Fritz, running on an RTX 2080 GPU, that tops the current (November 23rd) list. We don't believe that engine vs engine competition is the best reason to use our new flagship engine, since its strengths as an effective analysis partner go beyond mere rating, but it is still a valuable independent metric.

CCRL standings

November 23rd results matrix, top 12 engines (best versions only)

The website is very sophisticated with an impressive range of filters, details and statistics. By default, it will display only the best versions of any specific engine, but if you click on the Complete List link, it will show every version and hardware setup used. 

A breakdown of the individual match results can be found at the bottom of the list, or by clicking on each entry. All games are available for download.

Getting the most performance from your engine

One of the most fascinating aspects of neural networks is their reliance (for now) on a graphics processor (GPU) to achieve the best results. Some consider this 'unfair' when comparing against a conventional chess engine such as Stockfish, Komodo, and so many more, since the neural network engine gets both a Central Processing Unit (CPU) and a GPU. However, this is a very misleading way of describing what goes on. The reason is two-fold:

  1. The GPU isn’t used for the search itself. The search is conducted on the CPU just as in any engine, while the huge weights file, containing millions of values which make up its understanding of chess, is run on the GPU for each node to produce an evaluation. Even with a top-of-the-line GPU, only two CPU cores are used to run the search. More will actually hurt speed, since the bottleneck is really how fast the GPU can evaluate the neural network, and not the CPU’s search calculations.
  2. The classic engine can certainly use two CPU cores, or four, or 32, or 128. Every added core just increases the speed of its calculations. Fat Fritz and Leela gain nothing from mountains of additional CPU cores.

As a result, saying the neural network has an advantage thanks to the GPU is incorrect since it cannot benefit from many CPU cores as conventional engines can. It is simply different.

Understanding that a conventional engine is different from a neural network is fine, but how can one compare them in a balanced situation? Sadly the answer is not straightforward.

How to compare: The AlphaZero ratio

When DeepMind published their paper on AlphaZero and the results against Stockfish 8, they gave a crucial piece of information that, at the time, probably seemed a curiosity at best: the ratio. All the results they published were based on two things: the time control, and above all the overall speed advantage in nodes per second that Stockfish had: 900 times more nodes per second (on average). 

Source: DeepMind

Why was this important? Because when neural networks came to the PC as a practical way to reproduce AlphaZero, the only way to compare results was to set up a match under the same conditions (around 63,000 nodes per second for AlphaZero vs 58,100,000 for Stockfish). This means that if Fat Fritz or Leela is running at 10,000 nodes per second, Stockfish should be running at around 9,000,000 nodes per second. Tip the balance too far one way or the other and you will be clearly favouring one side.
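The ratio arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the 900:1 target is simply the average figure quoted from DeepMind.

```python
# Illustrative helpers; the names are hypothetical and not part of any engine or GUI.
ALPHAZERO_RATIO = 900  # average Stockfish-to-AlphaZero nodes-per-second ratio quoted above


def matching_stockfish_nps(nn_nps, ratio=ALPHAZERO_RATIO):
    """Nodes per second Stockfish should reach to mirror the AlphaZero conditions."""
    return nn_nps * ratio


def actual_ratio(stockfish_nps, nn_nps):
    """How many times more nodes per second the conventional engine searches."""
    return stockfish_nps / nn_nps


if __name__ == "__main__":
    # A neural network engine at 10,000 nps calls for about 9,000,000 nps on Stockfish.
    print(matching_stockfish_nps(10_000))
    # The speeds quoted for the AlphaZero match come out a little above 900 to 1.
    print(round(actual_ratio(58_100_000, 63_000)))
```

Measure both speeds on your own machine (for instance from the start position) and compare the result of `actual_ratio` with 900 before drawing conclusions from a match.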

The Diesel Dragster

One curiosity that DeepMind also shared in their first preprint was how AlphaZero needed a minimum search depth to really show its strength.

Performance of AlphaZero and Stockfish, plotted against time per move. (source: DeepMind)

In a nutshell, it shows that at very shallow depths even AlphaZero lost, and lost badly, to Stockfish. Right around 30 thousand nodes per move it broke even, and it only pulled ahead beyond that. Please note that this was nodes per move, not per second. For AlphaZero, on DeepMind's impressive hardware, that meant about half a second of analysis, but on a slower GPU more time may be needed. Fat Fritz, for all its wonderful creativity, illustrates this in spades. GM Moradiabadi has noted how principled Fat Fritz’s play was, meaning that if the position called for material sacrifices for activity, it would not hesitate, even if this led to sharp, double-edged positions. It is perhaps for this reason that very fast games (or weak hardware) can lead to disappointing scorelines, as the complications need a certain amount of calculation to be resolved.
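To get a feel for what the break-even point means on your own hardware, here is a minimal sketch that converts the roughly 30 thousand nodes per move into thinking time. The function name is hypothetical, the 63,000 nps figure is the AlphaZero speed quoted earlier, and the slower speeds are made-up examples rather than measurements of any particular GPU.

```python
BREAK_EVEN_NODES = 30_000  # approximate nodes per move at which AlphaZero drew level


def seconds_to_break_even(nps, nodes=BREAK_EVEN_NODES):
    """Seconds of thinking per move needed to search `nodes` at `nps` nodes per second."""
    if nps <= 0:
        raise ValueError("nps must be positive")
    return nodes / nps


if __name__ == "__main__":
    # 63,000 nps is the AlphaZero figure; the other two are hypothetical slower setups.
    for nps in (63_000, 10_000, 3_000):
        print(f"{nps:>6} nps -> {seconds_to_break_even(nps):.2f} s per move")
```

At AlphaZero's speed this comes to about half a second, while a setup running at a few thousand nodes per second would need several seconds per move just to reach the same point.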

However, if you do take these factors into consideration, you can get superlative results. Here is a 100-game match against Stockfish Dev (November 6 build) using the TCEC 16 openings, played at 10 minutes per game with a 5-second increment.

The hardware was such that Stockfish ran at around 27 million nodes per second (based on start position) on 32 threads, while Fat Fritz (a newer build) was running at 30 thousand nodes per second, for a perfect 900 to 1 ratio as explained above. This also means that for each move, Fat Fritz was getting a minimum of 400 thousand nodes. Remember that AlphaZero was run in far longer games reaching easily 30 million nodes per move, with Stockfish getting 900 times that.

 

So why ‘Diesel Dragster’? The idea was to convey a racer that may be slow to get to speed, but that has a fantastic top speed once it gets there.

Optimal configurations

If you look at the internal settings of both Fat Fritz and Leela, you will notice many values that are completely different, such as the CPUCT, the CPUCT factor, and so on. These values are the result of a heavily automated tuning process known as CLOP, which helps determine the best-performing values for an engine. They will vary from engine to engine, or in this case from neural network to neural network, so do not assume that one network's best values will work on another. They might easily cripple the other neural network instead of helping it.

Still, here is a small secret to share with readers: the tuning process was extended to a full 170 hours for Fat Fritz, and its new optimal values are slightly different from the ones that came with the first release. While they will be included in the next engine update, feel free to use them now:

Just right-click on the engine pane and select Properties to open the UCI options.

Change the cpuct to 3.56 (instead of 3.67), the cpuctfactor to 2.74 (instead of 2.54), and the Policy Temperature to 1.84 (instead of 1.87). You can see these values and where they are above.
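If you drive the engine from a script or another GUI rather than the Fritz dialog, the same changes can be sent as ordinary UCI `setoption` commands. The sketch below only builds the command text. The option names in it are assumptions based on the labels used in this article, so check the exact spelling your engine reports in response to the `uci` command.

```python
# Values from the article. The option names are assumptions based on the labels
# used in the text and may be spelled differently in your engine's UCI options.
RETUNED_VALUES = {
    "CPuct": 3.56,
    "CPuctFactor": 2.74,
    "PolicyTemperature": 1.84,
}


def setoption_commands(options):
    """Build standard UCI 'setoption' lines for the given name/value pairs."""
    return [f"setoption name {name} value {value}" for name, value in options.items()]


if __name__ == "__main__":
    for line in setoption_commands(RETUNED_VALUES):
        print(line)
```

Each printed line can be sent to the engine's standard input before starting a search.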

Nvidia 16xx video cards

If you own a machine that has one of the newer Nvidia mid-range cards such as the GTX 1650, GTX 1650 Ti, GTX 1660 or GTX 1660 Ti, you can enjoy a nice speed boost by changing the UCI option called Backend to cudnn-fp16. Be sure to leave the number of threads at 2, though. More than that will hurt performance.

Multiple GPUs

One scenario that was not covered at all in the installation process is the matter of multiple GPUs. There is no question you can get the best results with more than one GPU. Unlike video games in the past, which required a special connection linking the two cards, all you need is to have them installed and to make a few changes in the UCI options.

The exact changes are:

Backend — Multiplexing
Backend Options — (backend=cudnn-fp16,gpu=0),(backend=cudnn-fp16,gpu=1)
Threads — 4
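The Backend Options string follows a regular pattern, one parenthesised entry per GPU, so it can be generated rather than typed by hand. The following is a minimal sketch with a hypothetical helper name. The article only gives the two-GPU case, so extending the pattern to more cards, and the rule of two threads per GPU, are assumptions extrapolated from the settings above.

```python
def multi_gpu_options(gpu_count, backend="cudnn-fp16", threads_per_gpu=2):
    """Return the three settings listed above, generalised to `gpu_count` GPUs.

    threads_per_gpu=2 is an assumption extrapolated from the article
    (2 threads for one GPU, 4 threads for two GPUs).
    """
    if gpu_count < 2:
        raise ValueError("multiplexing only makes sense with two or more GPUs")
    # One "(backend=...,gpu=N)" entry per card, numbered from 0.
    backend_options = ",".join(
        f"(backend={backend},gpu={index})" for index in range(gpu_count)
    )
    return {
        "Backend": "Multiplexing",  # spelled as shown in the article
        "Backend Options": backend_options,
        "Threads": gpu_count * threads_per_gpu,
    }


if __name__ == "__main__":
    for name, value in multi_gpu_options(2).items():
        print(f"{name}: {value}")
```

For two GPUs this reproduces exactly the values listed above.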

Conclusion

Hopefully you now find yourself armed with the means to get the most out of your Fat Fritz or Leela, and know what to expect.

For more, check out all stories on Fat Fritz and the new Fritz 17.


Editor and writer for the ChessBase English-language news page. He lives in Rio de Janeiro, Brazil.