In C#, the agents (circle and rectangle) start by running the Python script, which initializes the corresponding agent with the important features of the environment, passed as JSON strings, as seen in the image. [IMPORTANT]: set the boolean "is_python" to true if you are working in Python.
In Python, an environment object is initialized from the JSON strings received and stored in the agent object that is created (circle or rectangle). This agent object is then saved to a file using "pickle", since it is needed again each time the script runs during the execution of the agent's actions:
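The initialization step described above could look roughly like the following. This is a minimal sketch, not the project's actual script: the class names `Environment` and `Agent`, the function `init_agent`, the `n_actions` parameter and the feature keys are all hypothetical and only illustrate the parse-then-pickle flow.

```python
import json
import pickle
from pathlib import Path


class Environment:
    """Hypothetical container for the environment features sent by C#."""

    def __init__(self, features):
        self.features = dict(features)


class Agent:
    """Hypothetical agent that owns its environment and current action."""

    def __init__(self, name, environment, n_actions):
        self.name = name                # e.g. "circle" or "rectangle"
        self.environment = environment
        self.n_actions = n_actions      # size of the list of possible moves
        self.current_action = 0         # index into that list


def init_agent(name, features_json, n_actions, path):
    """Build the agent from a JSON string and save it with pickle."""
    environment = Environment(json.loads(features_json))
    agent = Agent(name, environment, n_actions)
    # Persist the agent so later runs of the script can reload it.
    with Path(path).open("wb") as f:
        pickle.dump(agent, f)
    return agent
```

Because each call from C# starts a fresh Python process, nothing survives in memory between calls. Saving the agent to disk is what lets the next call pick up where this one left off.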
To execute actions and update the environment: In C#, inside the update function, the Python script is called once per second to obtain the agent's current action (it is not possible to call the Python script every frame). The updated features of the environment are sent as arguments, again as JSON strings, so that the agent can update its state before selecting a new action. The random agent does not need this information, but it will be useful for other implementations:
In Python, the agent object is loaded and its environment object is updated according to the arguments received on the command line. The agent then changes its current action and prints it, i.e., it prints the index of the current action within its list of possible moves. In this case, the agent implementation is random. [IMPORTANT]: the print of the current action must not be removed, because C# reads it to update its current action variable.
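The load, update, choose and print cycle described above can be sketched as follows. The names `Environment`, `RandomAgent` and `step`, and the argument order on the command line, are assumptions made for illustration. Saving the agent back to disk after each step is also an assumption, added so that the updated state persists between calls.

```python
import json
import pickle
import random
import sys


class Environment:
    """Hypothetical container for the environment features."""

    def __init__(self, features):
        self.features = dict(features)

    def update(self, features):
        # Overwrite stored features with the values received from C#.
        self.features.update(features)


class RandomAgent:
    """Hypothetical agent that picks a random index into its moves."""

    def __init__(self, environment, moves):
        self.environment = environment
        self.moves = list(moves)
        self.current_action = 0

    def choose_action(self):
        self.current_action = random.randrange(len(self.moves))
        return self.current_action


def step(path, features_json):
    """Reload the agent, update its environment, pick an action."""
    with open(path, "rb") as f:
        agent = pickle.load(f)
    agent.environment.update(json.loads(features_json))
    action = agent.choose_action()
    # Assumption: save again so the next call sees the updated state.
    with open(path, "wb") as f:
        pickle.dump(agent, f)
    return action


# Assumed invocation: script.py <pickle_path> <features_json>
if __name__ == "__main__" and len(sys.argv) == 3:
    # The printed index is the only output, so C# can parse stdout directly.
    print(step(sys.argv[1], sys.argv[2]))
```

Keeping the action index as the only thing written to standard output makes the C# side simple to parse. Any debugging output would be safer on standard error.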
This code can now be extended to take the current state of the environment into account when deciding which action to return. However, there are a few tools that have not been considered so far: