Causal Graphs

AI technology carries a very high cost of development and maintenance, along with a reliance on big data, complex models and cloud computing. Deploying at scale to trade hundreds of instruments needs to take the cost of learning into account for a financially viable business model.

Humans don't drive off the edge of a cliff to learn.

Present deep neural networks will likely hit a limit, because they do not reason: there is no causality, and no way to represent meaning or real interaction. This is a fundamental barrier for deep neural networks, and overcoming it requires something more.

A child does not need a million examples of a traffic light; after seeing only a few, most children can cross any road, in any country, without getting run over.


Getting a machine to learn that fast and that cheaply is a big missing piece. At DeepMoney our contribution, albeit in a narrow vertical, can carry meaning in many other domains. Through causal-graph-based AI we can cope more effectively with non-static environments; in our case it is not the flow of traffic, but the flow of market data and price action on assets.
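
To make the causal-graph idea concrete, here is a minimal illustrative sketch using the networkx library; the variables and edges are hypothetical and are not DeepMoney's production graph:

    # Hypothetical causal graph over market variables (illustrative only).
    # Edges point from cause to effect; a directed acyclic graph (DAG) lets us
    # reason about which variables drive price action rather than merely co-move.
    import networkx as nx

    g = nx.DiGraph()
    g.add_edges_from([
        ("macro_news",  "volatility"),
        ("volatility",  "bid_ask_spread"),
        ("order_flow",  "price_return"),
        ("volatility",  "price_return"),
    ])

    assert nx.is_directed_acyclic_graph(g)          # a causal graph must be acyclic
    print(list(g.predecessors("price_return")))     # direct causes of price_return
    print(list(nx.ancestors(g, "price_return")))    # all upstream drivers

Because the graph is directed, intervening on an upstream node (say, a volatility shock) has a well-defined set of downstream effects, which is what allows a model to keep reasoning sensibly when the environment shifts.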

Thinking Fast and Slow

DeepMoney developed pioneering machines leveraging the thesis Nobel laureate Daniel Kahneman sets out in 'Thinking, Fast and Slow', validated by the influential 'Reinforcement Learning, Fast and Slow' paper from Google DeepMind.

Our patent-pending models use similarity to extract episodic memories from historical market data, then apply slow learning with causal AI graph models to that small, selective set of experiences.
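
As an illustration of the similarity step only (a simplified sketch; the window length, features and distance measure are assumptions, and the patented mechanism is not shown), past market windows can be embedded and the most similar historical episodes retrieved for the current window:

    # Illustrative similarity search over historical market "episodes", assumed
    # here to be fixed-length windows of returns; the real feature set is not shown.
    import numpy as np

    def embed(window):
        # toy embedding: z-scored returns of the window
        w = np.asarray(window, dtype=float)
        return (w - w.mean()) / (w.std() + 1e-9)

    def most_similar(current, history, k=5):
        """Indices of the k historical windows most similar (cosine) to `current`."""
        q = embed(current)
        sims = [float(q @ embed(h)) /
                (np.linalg.norm(q) * np.linalg.norm(embed(h)) + 1e-9)
                for h in history]
        return np.argsort(sims)[-k:][::-1]

    history = [np.random.randn(20) for _ in range(1000)]   # 1000 past 20-bar windows
    current = np.random.randn(20)
    print(most_similar(current, history, k=3))

The retrieved episodes form the "small selective experience" on which the slower causal-graph learning is then applied.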

The human brain uses memory and experience, then applies logic.

Bringing cognitive science into DeepMoney wealth machines, we considered the two deep reinforcement learning (RL) methods that mitigate the sample-efficiency problem:



  1. episodic deep RL

  2. meta-RL.

We examine how these techniques enable fast deep RL. As noted in our opening paragraph, the cost of learning affects the time spent learning, thinking and deciding; in financial markets, poorly timed decisions are little better than haphazard events.

In the methods originally proposed for deep RL, learning proceeds by incremental adjustment of the network's connection weights. The adjustments made during this form of learning must be small, in order to maximise generalisation and avoid overwriting the effects of earlier learning, an effect sometimes referred to as 'catastrophic interference'. This demand for small step-sizes is one source of slowness in those methods.
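
The small-step style of update can be seen in the generic tabular Q-learning rule below (a textbook sketch, not DeepMoney code; the state, action and learning-rate values are illustrative):

    # Incremental (slow) value learning: each experience nudges the estimate by a
    # small fraction alpha, so many samples are needed before the value is accurate.
    def q_update(q, state, action, reward, next_q_max, alpha=0.01, gamma=0.99):
        old = q.get((state, action), 0.0)
        target = reward + gamma * next_q_max
        q[(state, action)] = old + alpha * (target - old)
        return q

    q = {}
    for _ in range(10):                       # ten identical experiences...
        q = q_update(q, "s", "buy", reward=1.0, next_q_max=0.0)
    print(q[("s", "buy")])                    # ...still far below the true value of 1.0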

A second source of slowness is weak inductive bias. Any learning procedure necessarily faces a bias–variance trade-off: a procedure with weak inductive bias can master a wider range of patterns (greater variance), but will in general be less sample-efficient (it takes longer and needs more samples).

Together, these two factors, incremental parameter adjustment and weak inductive bias, explain the slowness of first-generation deep RL models.

However, recent research shows there is another way to accomplish the same goal: keep an explicit record of past events, and use this record directly as a point of reference when making new decisions. This idea, referred to as episodic RL, parallels 'non-parametric' approaches in machine learning and resembles 'instance-' or 'exemplar-based' theories of learning in psychology. When a new situation is encountered and a decision must be made about what action to take, an internal representation of the current situation is compared with stored representations of past situations. The action chosen is the one associated with the highest value, based on the outcomes of the past situations most similar to the present. Because each experienced event is recorded explicitly, the information it carries can be leveraged immediately to guide behaviour.
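
A minimal sketch of that lookup-and-choose procedure, under illustrative assumptions (a generic embedding, Euclidean distance, and a simple average over the k most similar stored outcomes), might look like this; it follows the general episodic-control pattern rather than DeepMoney's specific implementation:

    # Episodic value estimation: store (embedding, action, outcome) records and,
    # for a new situation, score each action by the outcomes of the most similar
    # stored situations. The embedding itself would be trained slowly, separately.
    import numpy as np

    memory = []   # list of (embedding, action, outcome) tuples

    def store(embedding, action, outcome):
        memory.append((np.asarray(embedding, float), action, float(outcome)))

    def episodic_value(embedding, action, k=3):
        """Average outcome of the k most similar past situations for this action."""
        q = np.asarray(embedding, float)
        matches = [(np.linalg.norm(q - e), o) for e, a, o in memory if a == action]
        if not matches:
            return 0.0
        matches.sort(key=lambda d_o: d_o[0])
        return float(np.mean([o for _, o in matches[:k]]))

    def act(embedding, actions=("buy", "sell", "hold")):
        return max(actions, key=lambda a: episodic_value(embedding, a))

    store([0.1, 0.2], "buy", +1.0)
    store([0.1, 0.3], "sell", -0.5)
    print(act([0.11, 0.21]))   # picks the action whose similar past outcomes were best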

Episodic deep RL is thus able to go 'fast' where earlier methods for deep RL went 'slow'. But there is a twist to this story: the fast learning of episodic deep RL depends critically on slow, incremental learning.

This is the gradual learning of the connection weights that allows the system to form useful internal representations or embeddings of each new observation. The format of these representations is itself learned through experience, using the same kind of incremental parameter updating that forms the backbone of standard deep RL. Ultimately, the speed of episodic deep RL is enabled by this slower form of learning. That is, fast learning is enabled by slow learning.

Leveraging past experience to accelerate new learning is referred to in machine learning as meta-learning. One influential version of the idea works as follows.

A recurrent neural network is trained on a series of interrelated RL tasks. The weights in the network are adjusted very slowly, so they can absorb what is common across tasks, but cannot change fast enough to support the solution of any single task. In this setting, something rather remarkable occurs. The activity dynamics of the recurrent network come to implement their own separate RL algorithm, which ‘takes responsibility’ for quickly solving each new task, based on knowledge accrued from past tasks. Effectively, one RL algorithm gives birth to another, and hence the moniker ‘meta-RL’.
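
A toy version of that setup, following the general 'learning to reinforcement learn' recipe rather than any DeepMoney model (the bandit task, network size and training loop are all illustrative assumptions), shows the division of labour: the network's weights are updated slowly across many tasks, while within a task only its hidden state changes:

    # Meta-RL sketch: an LSTM policy is trained slowly across many two-armed
    # bandit tasks; within each task it adapts quickly via its hidden state,
    # receiving the previous action and reward as input.
    import torch
    import torch.nn as nn

    class MetaRLAgent(nn.Module):
        def __init__(self, n_actions=2, hidden=32):
            super().__init__()
            self.lstm = nn.LSTMCell(n_actions + 1, hidden)   # prev action one-hot + prev reward
            self.policy = nn.Linear(hidden, n_actions)

        def forward(self, x, state):
            h, c = self.lstm(x, state)
            return self.policy(h), (h, c)

    agent = MetaRLAgent()
    opt = torch.optim.Adam(agent.parameters(), lr=1e-3)      # slow outer-loop learning

    for task in range(500):                                  # each task: new arm probabilities
        p = torch.rand(2)
        state = (torch.zeros(1, 32), torch.zeros(1, 32))
        x = torch.zeros(1, 3)
        log_probs, rewards = [], []
        for t in range(20):                                  # fast inner loop: hidden state only
            logits, state = agent(x, state)
            dist = torch.distributions.Categorical(logits=logits)
            a = dist.sample()
            r = torch.bernoulli(p[a]).item()
            log_probs.append(dist.log_prob(a))
            rewards.append(r)
            x = torch.zeros(1, 3)
            x[0, a] = 1.0                                    # feed back previous action...
            x[0, 2] = r                                      # ...and previous reward
        loss = -(torch.stack(log_probs).sum() * sum(rewards))  # simple REINFORCE on task return
        opt.zero_grad()
        loss.backward()
        opt.step()

The weights change only a little per task (the slow algorithm), yet within a task the hidden-state dynamics quickly steer choices toward the better arm (the fast algorithm the network has learned to run).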



Episodic memory and meta-RL can also be combined, in an arrangement known as episodic meta-RL. As in episodic deep RL, an episodic memory catalogues a set of past events, which can be queried based on the current context. However, rather than linking contexts with value estimates, episodic meta-RL links them with stored activity patterns from the recurrent network's internal or hidden units (as in DeepMoney's master-key technology). These patterns are important because, through meta-RL, they come to summarise what the agent has learned.

When a task that has been solved before is encountered again, episodic meta-RL immediately retrieves and reinstates the solution it previously discovered, avoiding the need to re-explore.

On the first encounter with a new task, the system benefits from the rapidity of meta-RL; on the second and later encounters, it benefits from the one-shot learning ability conferred by episodic control.
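
One way to sketch that combination is a context-keyed store of recurrent hidden states; the embedding, the nearest-neighbour lookup and the distance threshold below are all illustrative assumptions, not a description of the production system:

    # Episodic meta-RL sketch: contexts are linked to stored recurrent-network
    # hidden states rather than to value estimates. On re-encountering a familiar
    # context, the stored hidden state is reinstated, so the solution found by the
    # fast inner-loop learner does not have to be rediscovered.
    import numpy as np

    hidden_store = []   # list of (context_embedding, hidden_state) pairs

    def remember(context, hidden_state):
        hidden_store.append((np.asarray(context, float), np.asarray(hidden_state, float)))

    def reinstate_or_none(context, threshold=0.5):
        """Return the stored hidden state of the nearest past context, if close enough."""
        if not hidden_store:
            return None
        q = np.asarray(context, float)
        dists = [np.linalg.norm(q - c) for c, _ in hidden_store]
        i = int(np.argmin(dists))
        return hidden_store[i][1] if dists[i] < threshold else None

    context = np.array([0.2, 0.8])
    h = reinstate_or_none(context)
    if h is None:
        h = np.zeros(32)          # unfamiliar context: start the inner loop from scratch
    # ...run the recurrent agent from h, then call remember(context, final_hidden_state)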

©2024 Deepmoney · All rights reserved.
