Amazon Bedrock’s “Sorry, I’m unable to assist you with this request” solved: a journey into...
15 January 2025 - 11 min. read
Matteo Goretti
DevOps Engineer
In the last few years, we have witnessed an explosion of interest in Artificial Intelligence (AI) in all its forms.
Tools like ChatGPT, Bard, Claude, and Bing AI have made this technology accessible to a vast audience, highlighting its potential, especially in Generative AI.
Many companies have already ventured into commercial applications. Some examples are the personalization of corporate and non-corporate content, the generation of highly targeted and personalized promotional materials, the production of technical documentation, the real-time translation of texts, and even the improvement of Help Desk and customer support systems.
The hype generated, however, triggered a "GenAI rush," causing the so-called fear of missing out (FOMO): It is more and more common to fit Generative AI to the business problems, reversing the cause-effect relationship with a serious risk of incurring considerable investments just for the sake of it.
Despite the extraordinary possibilities offered, it is important to consider that Generative AI is not always the solution to all business challenges: it is necessary to evaluate its use from multiple perspectives to understand whether it can truly offer a competitive advantage. This advantage should be unique, tangible, and valuable enough to be perceived and justify the effort and investment, which are still very high today.
Let's focus on a real-world use case.
In this article, we'll start discussing the reason why we chose Generative AI as the best tool to maximize results and retracing all the valuable lessons we learned during the project implementation.
For a client of ours, a player in the food industry, we faced the need to develop a planning and recommendation engine. Starting from an input carrying personalized criteria, the solution autonomously composes a daily and/or weekly menu, planning meals by selecting dishes based on user preferences, available credits, and users' diet needs.
The main challenges were:
The goal was twofold: on the one hand, to increase customer retention and loyalty while simultaneously reducing friction; on the other hand, to minimize the risk of revenues lost for the company.
Many Natural Language Processing (NLP) algorithms were already available. They can process natural language and interpret it to provide outputs according to specific decision criteria.
By combining NLP algorithms, recommendation systems, and best-fit techniques, we could achieve the same goal as we did using Generative AI.
BUT
We ended up choosing Gen AI to implement our solution. Generative AI brought us much closer to the original objectives for several reasons:
Now, let's dive deep into the main takeaways of implementing the solutions that will be essential for the success of all future projects involving Generative AI.
We should consider that Using LLMs can greatly simplify our lives in all contexts where extreme precision is not a critical aspect since providing a new service to the user is already a competitive advantage.
In such scenarios, there is no need to invest time and resources in identifying a specific model to solve a particular problem and invest in training it. Model training, in fact, remains one of the most expensive practices in AI.
LLMs come with training that is already broad enough to be independent from advanced adjustments. This allows us to achieve a satisfactory result in relatively short time frames compared to classical algorithms.
The other side of the coin is that an acceptable result may not be the best we could achieve.
Given that, we should always follow a Continuous Improvement approach, maybe relying on popular "classic" algorithms to work afterward to optimize the output and achieve the best match.
Perfection has no limits!
When it comes to user experience, end-user benefits are often underestimated, but, in this case, the planner has a concrete impact on two dimensions: time and money.
The time spent by the user in planning meals for the following week varies between 5 and 15 minutes (depending on needs, dish selection, and comparison of specific nutritional values).
With the first test alone, the model took 38 seconds to present a plan to the user. As mentioned, various aspects can be improved to reduce time and enhance output. However, in less than 2 minutes, the user can adjust the proposed plan and end up with a complete menu.
This means a time savings of 60%-86%.
Moreover, it automatically solves the problem of users needing to remember to order from one week to the next, thus avoiding the need to go out for lunch or order through delivery services. While the discomfort of not having anything to eat for lunch can't be quantified, we can measure that, on average, having lunch at a restaurant costs 30%-45% more, considering the same amount of food consumed. Therefore, there is an improved user experience factor and indirect cost savings.
From the company's perspective, the planner creates a direct relationship between customer satisfaction and profit. Every time a user forgets to order, a lose-lose scenario is created: the user experiences discomfort, and the company incurs a loss of revenue.
With the planner, the company ensures that there will be revenue every day and maximizes it, as the model will seek to use the maximum number of credits possible when planning meals.
The consumption of Generative AI in terms of economic aspects, energy, computational power, and CPU usage is considerable. It should be kept in mind that we are using an extremely powerful tool to solve specific and narrow problems that are often potentially solvable with far fewer resources (but with greater implementation effort).
It's like "shooting ants with bazookas": you'll certainly succeed in killing them but with lots of considerable side effects.
This raises the sustainability topic, on which Generative AI has a significant impact from various perspectives:
In conclusion, allocating a sustainable budget and planning long-term is essential to ensure the economic sustainability of Generative AI usage, along with raising user awareness regarding payment for premium plans.
When using Generative AI, the workload context can be a friend... or not (or, as we will see, even both), making cost prediction uncertain.
From a purely computational and programming perspective, using LLMs is relatively straightforward because the data processing that generates the output occurs almost automatically, like in a black box.
Since advanced computations are not required (at least in the initial phase), most of the effort (and the needle of the scale affecting the costs) shifts to system integration and understanding the logical context in which the LLM operates.
In using LLMs, in fact, costs depend on the number of tokens (that can be considered as similar to the textual syllables) inputted into the prompt and outputted from the generated result. For this reason, limiting uncertainty in prompt usage whenever possible is essential to avoid uncontrolled expenses.
That said, let's go back to our planner: each textual input causes a cost for our customer that can't be overlooked since, for the solution to function properly, each prompt must always include additional details, including the entire menu characterized by a very long text. This information is necessary and provided automatically by the system itself, along with the user's prompt (meta-prompt).
Optimizing the length and variability of the prompt is difficult, so the cost cannot be predicted with certainty.
We managed to avoid uncontrolled cost explosion with a couple of tricks: a forced limit of 200 characters for the user to express criteria, ensuring an almost stable meta-prompt length;
Through the meta-prompt, we could pass the maximum amount of credits available per person (80 credits, 8€) so that we could limit the number of dishes and, as a consequence, the number of tokens in output.
Although LLMs are not made for mathematical calculations, the result we obtained was acceptable for us since the error in using credits was minimal and for the user since the system ensures that any dishes exceeding the budget will be hidden.
One aspect we have initially underestimated when using LLMs is undoubtedly the concept of "language".
LLMs are not limited to accepting and generating text in human-like language. They can accept input in any formal language based on some grammar and produce output in the same form.
Programming languages are included in this more comprehensive definition: JSON, for example, is a structured language that can be used as input in a prompt and generated consequently in an output. It was possible for us to use the LLM to propose the menu using JSON
It is possible to customize further by indicating the exact form of the JSON expected in the output.
The potential of Generative AI is now evident to everyone. Thanks to the countless opportunities to experience firsthand the power of Gen AI, end-users largely benefit from it.
However, questions and uncertainties arise for those who - like us - build and operate this kind of solution.
Is avoiding traditional AI complexities enough to justify experimentation through Gen AI?
Also, in the face of increasing interest and investments in the sector, companies, even the largest ones, are still trying to figure out how to monetize AI-based technologies fully.
In conclusion, despite their popularity among users, the path toward complete integration and sustainable use seems rather long and uncertain. Maintaining a critical and balanced approach and carefully weighing the benefits and risks associated with these emerging technologies is essential.
What do you think about this topic?
Have you already implemented a Gen AI-based solution?
Let us know in the comments!
Proud2beCloud is a blog by beSharp, an Italian APN Premier Consulting Partner expert in designing, implementing, and managing complex Cloud infrastructures and advanced services on AWS. Before being writers, we are Cloud Experts working daily with AWS services since 2007. We are hungry readers, innovative builders, and gem-seekers. On Proud2beCloud, we regularly share our best AWS pro tips, configuration insights, in-depth news, tips&tricks, how-tos, and many other resources. Take part in the discussion!