Over the last few months I have been studying various behavioral aspects of generative AI models. I tried the LLM-based Mathematica function ImageSynthesize, giving it an image as input so that it generates a similar image as output. During my exploration I found some interesting biases that may concern other users. Understanding them requires a deeper understanding of how this function works: what the properties of the underlying generative AI model are and which parameters are used. Naturally, my first attempt to learn these details was through the function's help page, but I found little relevant information in the formal documentation. It does imply that by default the function calls a model from OpenAI; however, this is far from satisfactory. Specifically, there is no information about the model version or about which parameters are used by default (for example, which "style" is selected, what temperature is used, etc.). This information is crucial for understanding the observed behavior, because many factors affect the generated results.
I then tried to get help from the Mathematica support center; however, after over a month of correspondence in which I repeatedly asked for the most basic information about what exactly happens when this function is called, I did not get any useful answers. Without this information, much of the work and money I have already invested in this project will be lost, and I will have to start all over on other, more accessible, AI platforms.
I apologize for the lengthy introduction, but as a loyal Mathematica user this has been an extremely frustrating experience, and I really don't understand why this kind of information is so difficult to obtain. So, if anyone here has experience or knowledge of the ImageSynthesize function, I would be very grateful for any information about the actual pipeline that is executed when the function is called. To be more specific: when using the function with the defaults provided by Mathematica, which model is being used? Which version exactly, which "style", and what temperature value, if applicable? Is it a true image-to-image function using a Stable Diffusion-type algorithm, or does it create a textual prompt from the input image and then generate the output image from that text? If it is the latter, which AI models are used in the different steps? And is there a way to retrieve the intermediate textual prompt, as this could be the key to understanding the observed behavior.
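In case it helps anyone probing the same question, here is a minimal sketch of what I have tried from within Mathematica itself to inspect the defaults. Note the assumptions: that ImageSynthesize exposes settings through the standard option mechanism, and that it dispatches to an external service via ServiceExecute (neither of which the documentation confirms); img stands for whatever input image you are testing with.

```wl
(* Any default option values the symbol exposes publicly *)
Options[ImageSynthesize]

(* Basic metadata and the usage message for the symbol *)
Information[ImageSynthesize]

(* Trace an actual call, keeping only steps that match ServiceExecute,
   to see whether (and how) an external service request is built.
   TraceInternal -> True also shows evaluations of internal symbols.
   This assumes the function goes through the service framework at all;
   the trace may be empty or very large. *)
Trace[ImageSynthesize[img], _ServiceExecute, TraceInternal -> True]
```

Even if the trace does not surface the exact model name or temperature, it might at least reveal whether an intermediate textual prompt is constructed before the image request is sent, which is the part I most want to confirm.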