# The evaluation functions in ftap

Evaluation functions, also called objective functions, criteria, or benchmark functions, are used in optimization and walk forward analysis to select the “best” back testing result. There are various evaluation functions, and each one has its advantages and disadvantages. In this article I will list all the evaluation functions I implemented in my ftap algorithm trading system, and I will give my opinion on the advantages and disadvantages of each function.

## The evaluation functions

### Balance

The simplest and most obvious evaluation function is the final balance at the end of back testing. It's virtually equivalent to the net profit. I just use the balance directly because the initial deposit is always the same.

Advantage: The most intuitive criterion. The goal of algorithm trading is to grow the fund, thus the larger the final balance, the better.

Disadvantage: Balance or net profit is one of the most naive evaluation functions and should not be used alone. It doesn't consider any risk, such as max draw down (MDD). Risk management is more important than profit.

### P/L ratio

Profit loss ratio. Formula: “Total Profit / Total Loss”.

Advantage: A larger P/L ratio means the losses are smaller relative to the profits, and thus the risk is smaller.

Disadvantage: Though it considers the total loss, the P/L ratio doesn't consider the draw down, which depends on the order in which the losses occur.
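A minimal sketch of the ratio and of the drawback above. The function names and the sample trades are made up for illustration and are not taken from ftap; treating “Total Loss” as the absolute sum of the losing trades is also an assumption.

```python
def pl_ratio(trade_profits):
    """Total profit of winning trades divided by total loss of losing trades."""
    total_profit = sum(p for p in trade_profits if p > 0)
    total_loss = -sum(p for p in trade_profits if p < 0)
    if total_loss == 0:
        # No losing trades: the ratio is undefined. Returning infinity is
        # one possible convention (an assumption, not ftap's behaviour).
        return float("inf") if total_profit > 0 else 0.0
    return total_profit / total_loss


def max_drawdown(trade_profits):
    """Largest peak-to-valley drop of the cumulative profit, in money."""
    equity = peak = mdd = 0.0
    for p in trade_profits:
        equity += p
        peak = max(peak, equity)
        mdd = max(mdd, peak - equity)
    return mdd


# The same five trades in two different orders (hypothetical numbers).
mixed = [120.0, -50.0, 80.0, -30.0, -20.0]
losses_first = [-50.0, -30.0, -20.0, 120.0, 80.0]

print(pl_ratio(mixed), max_drawdown(mixed))                # 2.0 50.0
print(pl_ratio(losses_first), max_drawdown(losses_first))  # 2.0 100.0
```

Both sequences have the same P/L ratio, but the second one has twice the draw down, which the ratio cannot see.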

### Risk Adjusted Return (RAR)

Formula: “Net Profit / (initial deposit + 2 * max draw down)”

Advantage: As its name implies, RAR includes the max draw down.

Disadvantage: RAR ignores the number of trades. Take the same strategy with the same initial deposit of $10000. Back testing A gives $10000 net profit and $5000 MDD, so the RAR is 0.5. Back testing B gives $8000 net profit and $4000 MDD, so the RAR is 0.44. Judged by RAR alone, A is superior. But if A generates only 30 trades per year while B generates 100, is A still better than B? That is doubtful, because 30 trades per year carries a higher risk of curve over-fitting than 100 trades.
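The two RAR values in the example can be checked with a few lines of Python. The function name is hypothetical; the formula and the numbers are the ones given above.

```python
def risk_adjusted_return(net_profit, initial_deposit, max_drawdown):
    """Net Profit / (initial deposit + 2 * max draw down)."""
    return net_profit / (initial_deposit + 2 * max_drawdown)


# Back testing A and B from the example above.
rar_a = risk_adjusted_return(10_000, 10_000, 5_000)
rar_b = risk_adjusted_return(8_000, 10_000, 4_000)

print(round(rar_a, 2), round(rar_b, 2))  # 0.5 0.44
```

Note that the trade count never enters the calculation, which is exactly the weakness described above.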

### Reversed Ulcer Index

The Ulcer Index measures volatility and draw down. The smaller the Ulcer Index, the smaller the draw down and the risk, so a small value is better than a large one. Since every other evaluation function treats larger as better, I invert the Ulcer Index with the formula: “100 / (1 + Ulcer)”.

Advantage: Max draw down and risk is well considered.

Disadvantage: The Ulcer Index can't be used as an evaluation function on its own because it has a fatal flaw: it doesn't include the net profit at all. Our ultimate goal is to make a profit, even though risk is quite important.
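As a rough sketch, the Ulcer Index is usually defined as the root mean square of the percentage draw downs from the running peak. How ftap samples the balance (per trade, per bar, per day) is not stated here, so the per-sample balance list below is an assumption, and the function names are hypothetical.

```python
import math


def ulcer_index(balances):
    """Root mean square of percentage draw downs from the running peak."""
    peak = balances[0]
    squared_drawdowns = []
    for balance in balances:
        peak = max(peak, balance)
        drawdown_pct = 100.0 * (peak - balance) / peak
        squared_drawdowns.append(drawdown_pct ** 2)
    return math.sqrt(sum(squared_drawdowns) / len(squared_drawdowns))


def reversed_ulcer(balances):
    """100 / (1 + Ulcer), so that larger is better."""
    return 100.0 / (1.0 + ulcer_index(balances))


# Hypothetical balance curves.
profitable = [10_000, 11_000, 9_900, 10_450, 12_000]
flat = [10_000, 10_000, 10_000]

print(ulcer_index(profitable))     # 5.0
print(reversed_ulcer(profitable))  # about 16.67
print(reversed_ulcer(flat))        # 100.0
```

The flat curve earns nothing yet gets the best possible score, which illustrates why this function should not be used alone.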

### PROM -- Pessimistic Return On Margin

The idea is from the book “The Evaluation and Optimization of Trading Strategies”, second edition, by Robert Pardo. FYI, I found the formula on a forum topic.

Advantage: Though PROM doesn't use max draw down directly, it includes the win/loss trade counts and the profit/loss, which implicitly accounts for MDD.

Disadvantage: Sometimes PROM selects results with a higher MDD, which is not ideal.
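The formula itself is not reproduced in this article. As it is commonly quoted, PROM shrinks the number of winning trades by its square root, inflates the number of losing trades by its square root, and divides the resulting adjusted net profit by the margin. The sketch below follows that commonly quoted form; whether ftap uses exactly this version, and whether it uses the initial deposit as the margin, are assumptions. Check the book before relying on it.

```python
import math


def prom(winning_trades, losing_trades, margin):
    """Pessimistic Return On Margin, in its commonly quoted form.

    winning_trades: profits of winning trades (positive numbers)
    losing_trades:  profits of losing trades (negative numbers)
    margin:         capital base; using the initial deposit is an assumption
    """
    n_win, n_loss = len(winning_trades), len(losing_trades)
    avg_win = sum(winning_trades) / n_win if n_win else 0.0
    avg_loss = -sum(losing_trades) / n_loss if n_loss else 0.0
    # Pessimistic adjustment: fewer winners, more losers.
    adjusted_gross_profit = avg_win * (n_win - math.sqrt(n_win))
    adjusted_gross_loss = avg_loss * (n_loss + math.sqrt(n_loss))
    return (adjusted_gross_profit - adjusted_gross_loss) / margin


# Hypothetical back test: 16 winners of $500, 9 losers of $400, $10000 deposit.
wins = [500.0] * 16
losses = [-400.0] * 9
print(prom(wins, losses, 10_000))  # about 0.12, versus a raw return of 0.44
```

Because the adjustment scales with the square root of the trade counts, a result with few trades is penalized proportionally more than one with many trades.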

### Return By Draw Down

Formula: sign(net profit) * abs((net profit / initial deposit) * PROM * total trade count) / (100 * (0.01 + 100 * MDD%^2))

I can't remember whether I got the idea from somewhere or created the formula myself, and the formula is quite complicated. Since I can't remember, I can't give its advantages and disadvantages.
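A literal transcription of the formula into Python may make it easier to read. The parameter names are hypothetical, and passing MDD% as a fraction (0.2 for 20%) is an assumption, since the unit is not stated here.

```python
import math


def return_by_drawdown(net_profit, initial_deposit, prom, trade_count, mdd_fraction):
    """Transcription of the Return By Draw Down formula above.

    mdd_fraction: max draw down as a fraction of the balance
                  (assumed unit; the article does not specify it).
    """
    magnitude = abs((net_profit / initial_deposit) * prom * trade_count)
    penalty = 100 * (0.01 + 100 * mdd_fraction ** 2)
    # copysign applies sign(net profit) to the magnitude.
    return math.copysign(magnitude, net_profit) / penalty


# Hypothetical inputs: a winning and a losing back test.
winner = return_by_drawdown(4_400, 10_000, 0.12, 25, 0.2)
loser = return_by_drawdown(-2_000, 10_000, -0.30, 25, 0.2)
print(winner > 0, loser < 0)  # True True
```

One effect of the sign/abs wrapper is visible in the second call: a negative net profit multiplied by a negative PROM would otherwise produce a positive score, and the wrapper keeps a losing result negative.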

### RAR of PROM

Just replace the net profit with PROM in the formula of Risk Adjusted Return. It's no better than RAR, if I remember correctly.

### PROM by Ulcer

Formula: PROM / (1 + Ulcer)

Advantage: It includes the max draw down (Ulcer) in PROM.

Disadvantage: There is no best evaluation function and I don't think this is the one.

### PROM by Standard Deviation

Formula: PROM / Balance SD

Advantage: It includes the standard deviation of the balance. The smaller the SD, the smoother the equity curve.

Disadvantage: It's not better than PROM alone.

## How the evaluation functions are used in ftap

The ftap optimization engine reports the values of all the evaluation functions. The walk forward analysis engine can only use one evaluation function at a time. I used PROM for a very long time, but now I'm using PROM by Ulcer.
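Conceptually, the selection step looks like the sketch below: every optimization result carries all the metrics, and the walk forward engine picks the result that maximizes one chosen metric. The data layout, field names, and numbers are hypothetical and are not ftap's actual API.

```python
def select_best(results, criterion):
    """Return the back testing result with the largest value of `criterion`."""
    return max(results, key=lambda result: result[criterion])


# Hypothetical optimization output: one entry per parameter set.
results = [
    {"params": {"period": 10}, "prom": 0.12, "ulcer": 5.0},
    {"params": {"period": 20}, "prom": 0.15, "ulcer": 12.0},
]

# PROM by Ulcer, using the formula given earlier: PROM / (1 + Ulcer).
for result in results:
    result["prom_by_ulcer"] = result["prom"] / (1.0 + result["ulcer"])

print(select_best(results, "prom")["params"])           # {'period': 20}
print(select_best(results, "prom_by_ulcer")["params"])  # {'period': 10}
```

With these made-up numbers, switching the criterion from PROM to PROM by Ulcer changes which parameter set is carried forward, because the second result pays for its higher PROM with a much larger Ulcer Index.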

If you have a better idea, don't hesitate to let me know so I can test it.
