
Artificial Intelligence


Responsible AI, Ethical AI and ChatGPT

Responsible AI, in simple words, is about developing explainable, ethical, and unbiased models. For instance, Google has published its AI principles at https://ai.google/principles/, which discuss this subject in detail. Similarly, Microsoft has published its AI principles at https://learn.microsoft.com/en-us/azure/cloud-adoption-framework/innovate/best-practices/trusted-ai. These key AI principles should be part of the design and development of large language models, as millions of users will see the output of these AI systems. However, ChatGPT falls short of these AI principles in many instances.

  • Lack of Trust – Responses can sound plausible even if the output is false (Reference –  https://venturebeat.com/ai/the-hidden-danger-of-chatgpt-and-generative-ai-the-ai-beat/). You can’t rely on the output and need to verify it eventually.
  • Lack of Explainability – There is no explanation of how the responses are derived. For instance, if a response is created from multiple sources, it should list the sources and give attributions. There might be an inherent bias in the content; how would this be removed before training or filtered from the response? If a response is generated from multiple sources, was any source given priority? Currently, ChatGPT doesn’t provide any explainability for its answers.
  • Ethical aspects – One example is code generation. The generated code carries no attribution to the original code, author, or license details. For instance, open source has many licenses (https://opensource.org/licenses/), and some might be restrictive. Also, were any open-source repositories preferred over others during training (or when filtering outputs)? Questions about the code’s security, vulnerability, and scalability must also be addressed. It is ultimately the accountability and responsibility of the developer to ensure that the code is reviewed, tested, secure, and follows their organization’s guidelines. All the above details should be transparent and addressed. For instance, if customers ask for a Certification of Originality for their software application (or if a law requires one in the future), this might be a challenge with AI-generated code unless the above is considered.
  • Fairness in responses – An excellent set of principles and an AI Fairness Checklist are outlined in the Microsoft paper – https://query.prod.cms.rt.microsoft.com/cms/api/am/binary/RE4t6dA. Given that ChatGPT is trained using internet content (a massive 570 GB of data sourced from Wikipedia, research articles, websites, books, etc.), it would be interesting to see how many items from the AI Fairness Checklist are followed. For instance, the content might be biased and have limitations. The human feedback used for data labeling might not represent the wider world (Reference – https://time.com/6247678/openai-chatgpt-kenya-workers/). These are some of the reported instances, but many more might be undiscovered. Large-scale language models should be designed to be fair and inclusive before they are released. We should not simply say that the underlying content was biased and hence the trained model inherited the bias. We now have an opportunity to remove those biases from the content itself as we train the models.
  • Confidentiality and privacy issues – As ChatGPT learns from interactions, there are no clear guidelines on what information would be used for training. If you interact with ChatGPT and end up sharing confidential data or customer code for debugging, would that be used for training? Can we claim ChatGPT is GDPR compliant?

I have listed only some highlights above to raise awareness around this important topic.

We are at a junction where we will see many AI advancements like ChatGPT, which millions of users will use. As these large-scale AI models are released, we must embed Responsible AI principles in them for a safer and more trusted environment.


ChatGPT and Generative AI Similarity syndrome

What is Similarity Syndrome?

Similarity Syndrome is a perception that might be created by large language models like ChatGPT, where any unique work is ultimately treated as common knowledge or as similar to other work.

For instance,

  • You write a unique quote for your book, and ChatGPT may end up saying it’s common knowledge and similar to many other quotes instead of attributing it to you.
  • You create an algorithm, and Codex uses your code with other algorithms to create something similar. It will reuse your code and sell it back to you.
  • You create digital art and unique images, and DALL·E 2 uses your image and other images to create similar images. It will sell this as an image library, and you buy it without realizing your image contributed to it.

In general, any piece of content can be used to create something similar. There is no explainability and attribution to the original authors/content creators of how the dynamic response was created. There should be strict laws for copyrights and accountability in AI.

I ran an experiment and asked ChatGPT about a quote I wrote in my book. Given below is the interaction.

Many websites found through Google search attribute the quote to me (like https://www.goodreads.com/author/quotes/49633.Naveen_Balani).

This is not about quotes but probably any generated content, be it digital arts, music, software code, or marketing content. AI would use your data, generate something similar, not attribute it to you, call it common knowledge, and even sell it to you. We will end up living in an AI world where everything is similar : )

To conclude, AI models should be transparent, explainable, auditable, ethical, and, most importantly, credit the original work created by the authors.


ChatGPT Even Predicts the Future

ChatGPT seems good at converting facts to fiction. I asked ChatGPT about myself; around 70% of the information is cooked up (see image below).  On a lighter note, maybe it’s predicting the future, which I am unaware of : ). But what about past details? I need a time machine to go back and change it. Hopefully, I will write a post on the Time machine someday.

With Google search, all the correct details and my website show up.

With ChatGPT, I expected that at least this information would be correct, as it’s readily available. The sources could be LinkedIn, my website, or other credible sources. There is no logic and no complex deep learning network to apply. The biggest problem with the response below is that there is no explainability and no detail on the sources that were used to construct it. How can one verify the correctness of the response? The responses are created dynamically even when that is not required. Unless you design with explainability in mind, the issues around trust and transparency will not be resolved. There are other issues around bias, ethics, and confidentiality, which I will talk about in future posts.

Unless you know the right answer, it would be difficult to spot the errors in the above response, as the answers are grammatically correct and sound right. This is a very simple scenario. Moving from general intelligence to specific verticals like healthcare (for instance, diagnosis) increases the complexity of large language models and poses significant challenges. I have discussed this in one of my previous blogs on the lack of domain intelligence in ChatGPT (and other conversational engines).

I can’t predict the future, but I believe Google Search might be better positioned to take Generative AI and integrate it with its search engine. Hopefully, they follow their 7 AI principles – https://ai.google/principles/ before the release to make AI applications transparent, explainable, and ethical. “Slow but steady wins the race” might come true for Google.

With the AI-powered Bing search engine, some early feedback and interesting facts about long chat sessions going wrong are documented on their website – https://blogs.bing.com/search/february-2023/The-new-Bing-Edge-%E2%80%93-Learning-from-our-first-week.

In my next post, I will talk about an interesting subject – “Similarity Syndrome” in ChatGPT.


ChatGPT for Enterprise Adoption

To make ChatGPT relevant for enterprise adoption, we need

Domain Adaptability
Domain Intelligence
Explainability 
Transparency
Non-biased
Privacy
Scalability – Compute power for Training and Inference
Lower Environmental Footprint

ChatGPT has a long way to go for enterprise adoption. What do you think? Where are ChatGPT and other large language models in enterprise adoption?


From Watson to ChatGPT: AI Chatbots and Limitations

The release of ChatGPT and the responses it provided brought Conversational AI back to the forefront and made it available to everyone through a simple web interface. We saw many creative ways to use ChatGPT, discussions on how it might impact the future, and questions about whether it will replace the Google search engine and jobs.

Well, let’s address this question with the below analysis –

From early Watson systems to ChatGPT, a fundamental issue still remains with Conversational AI.

Lack of Domain Intelligence.

While ChatGPT definitely advances the field of Conversational AI, I would like to call out the following from my book, Real AI: Chatbots (published in 2019) –

“AI can learn but can’t think”.

How to use the output of an AI system is thinking that would always be left to humans. AI systems and their knowledge will always be boxed in by what they have learned and can never generalize (like humans do) where domain expertise and intelligence are required.

What is an example of Domain Intelligence?

Take a simple example where you ask the Conversational AI agent to “Suggest outfits for Shorts and Saree”.

Fundamentally, any skilled person would treat them as two different options – matching outfits with Shorts and matching outfits with Saree OR asking clarifying questions, OR suggesting these options are disjoint and can’t be combined.

But with ChatGPT (or any general-purpose Conversational AI), the response was as shown below. Clearly, it tries to fill in a response without understanding the domain and context. This is a very simple example, but the complexity grows exponentially where deep expertise and correlation are required, like a doctor recommending treatment options. This is precisely why we saw many failures when AI agents were used to solve health problems: they tried to train general-purpose AI rather than build domain-expert AI systems.

The other issues with generative dialog AI systems are –

Explainability – Making the AI output explainable in terms of how it was arrived at. I have described this in my earlier blog – Responsible And Ethical AI

Trust and Recommendation Bias – Right recommendation and adaptability. I have explained this in my earlier blog – https://navveenbalani.dev/index.php/views-opinions/its-time-to-reset-the-ai-application/

For more details, I have explained this concept in my short  ebook – Real AI: Chatbots (2019) – https://amzn.to/3CmoexC

You can find the book online on my website – https://cloudsolutions.academy/how-to/ai-chatbots/ or enroll for a free video course at https://learn.cloudsolutions.academy/courses/ai-chatbots-and-limitations/

The intent of this blog was to raise awareness of ChatGPT and its current limitations. Any technology usually has a set of limitations, and understanding them will help you design and develop solutions with these limitations in mind.

ChatGPT definitely advances Conversational AI, and a lot of time and effort would have gone into building it. Kudos to the team behind it. It will be interesting to see how future versions of ChatGPT address the above limitations.

In my view, ChatGPT and other AI chatbots to follow will be similar to any other tool to assist you with the required information, and you will use your thinking and intelligence to get work done.

So, sit back and relax; the current version of ChatGPT will not replace anything which requires thinking and expertise!

On a lighter note, this blog is not written by ChatGPT 🙂


Responsible and Ethical AI – Building Explainable models

Ethical AI, in simple words, is about ensuring your AI models are fair, ethical and unbiased.

So how does bias get into the model? Let’s assume you are building an AI model which provides salary suggestions for new hires. As part of building the model, you have taken gender as one of the features, and you are using that feature to suggest salary. The model then discriminates on salary based on gender. In the past, this bias has come in through human judgements and various social and economic factors (https://en.wikipedia.org/wiki/Gender_pay_gap), but if you include this bias as part of the new model, it is a recipe for disaster. The whole idea is to build a model which is not biased and suggests salary based on people’s experience and merits.

Take another example of an application providing restaurant recommendations to a user and allowing the user to book a table. While recommending new restaurants, the AI application is designed to look at the amount spent in previous transactions and the rating of the restaurant (along with other features), and the AI system starts recommending restaurants which are costlier. Even though there might be good, less costly restaurants in the vicinity, they may not show up among the top recommendations. Also, the more the user spends, the more revenue for the restaurant application. In short, you are steering a class of users towards spending more on high-end restaurants without the users knowing about it. Does this classify as a bias or a smart revenue-generating scheme?

Ethical AI is a great topic for research and debate, as you will see a lot of development (as well as the usual marketing buzzwords) and governance in this area.

So how do you ensure your model is ethical and validate it? I am sharing my perspective below –

– Designing the model without bias – Ensure you don’t include features that can make your model biased. For instance, don’t include gender while predicting salary packages. Take time to validate the data sources and features being used to build the model.

– Explain the model output – Designing applications with explainability in mind should be a key design principle. If the user receives an output from an AI algorithm, an explanation of why the output was presented and how relevant it is should be built into the algorithm. This should empower users to understand why a particular piece of information is being presented and to turn on/off any preferences associated with an AI algorithm for future recommendations/suggestions.

– Validate the model – Validate the model with enough test cases. You will also see a lot of offerings (Ethical AI services) crop up in this area in the future. Again, the key is that these offerings/services need to be vertical-focused (understanding the domain) rather than pure-play horizontal AI services (else they would end up like the chatbot hype – https://navveenbalani.dev/index.php/articles/ai-chatbots-reality-vs-hype/).

– Accountability – Ultimately, humans need to look at the output from the AI system and take corrective action for critical tasks. I don’t see machines taking over human intelligence for critical tasks in the future. For instance, a cancer treatment option suggested by an AI system needs to be carefully investigated by the doctor, but a fashion website recommending the wrong products to a user is not critical and can be corrected later through feedback/learnings.
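As a minimal sketch of the first guideline above (designing the model without bias), the snippet below strips sensitive attributes from a candidate record before it reaches a salary model. The feature names and the `drop_sensitive_features` helper are hypothetical and exist only for illustration. Removing a column does not by itself guarantee an unbiased model, since other features can still act as proxies, so this complements the validation step rather than replacing it.

```python
# Hypothetical feature names, used only for illustration.
SENSITIVE_FEATURES = {"gender"}


def drop_sensitive_features(candidate, sensitive=SENSITIVE_FEATURES):
    """Return a copy of the candidate's features without the sensitive ones."""
    return {name: value for name, value in candidate.items() if name not in sensitive}


candidate = {"years_experience": 6, "skill_score": 0.82, "gender": "F"}

# Only these features would be passed on to the salary model.
features = drop_sensitive_features(candidate)
print(features)
```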

Going back to the restaurant application, if we design the application with the above guidelines in mind and make the output explainable to the user, we can have at minimum 4 levels of recommendations (shown as tiles in the application), along with evidence on why each recommendation is being provided –

  1. Recommending restaurants based on earlier restaurant spends, ratings, history and preferences of the user
  2. Recommending similar restaurants which are highly rated and less costly based on ratings, history and preferences of the user
  3. Recommending new restaurants based on user history and preferences of the user
  4. Recommendations generated by the system without applying any user preferences.

The revised application now provides various recommendations with enough evidence to back them up, and ultimately the choice is left to the user to pick a restaurant and book a table.

The above was an example of a very simple application, but imagine AI deployed across industries and in government agencies; developing and monitoring the AI system against ethical principles would then be extremely critical. Both the creators of the model and the validators (agencies, third-party systems, etc.) would be critical to ensuring AI models are fair, ethical and unbiased.

As we are the creators and validators of the AI system, the onus lies on us (humans) to ensure technology is used for good.
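The four recommendation levels can be sketched as follows. This is an illustrative toy, not a production recommender: the restaurant fields, the 25% spend window, the 4.0 rating cut-off and the tile titles are all assumptions made for the example. The point is that every tile carries its own evidence string that can be shown to the user.

```python
from dataclasses import dataclass


@dataclass
class Recommendation:
    tile: str
    restaurant: str
    evidence: str  # shown to the user next to the recommendation


def top_rated(candidates):
    """Pick the highest-rated restaurant, or None if there are no candidates."""
    return max(candidates, key=lambda r: r["rating"], default=None)


def build_tiles(restaurants, user):
    liked = [r for r in restaurants if r["cuisine"] in user["preferred_cuisines"]]
    spend = user["average_spend"]
    tiers = [
        ("Based on your spend and ratings",
         [r for r in liked if abs(r["cost"] - spend) <= 0.25 * spend],
         "matches your usual spend and preferred cuisine"),
        ("Similar, highly rated and less costly",
         [r for r in liked if r["rating"] >= 4.0 and r["cost"] < spend],
         "rated 4.0 or higher and cheaper than your usual spend"),
        ("New for you",
         [r for r in liked if r["name"] not in user["visited"]],
         "matches your preferred cuisine and you have not visited it"),
        ("Top rated overall",
         restaurants,
         "highest rating, no personal preferences applied"),
    ]
    tiles = []
    for title, candidates, evidence in tiers:
        best = top_rated(candidates)
        if best is not None:
            tiles.append(Recommendation(title, best["name"], evidence))
    return tiles


restaurants = [
    {"name": "Spice Court", "cuisine": "indian", "rating": 4.6, "cost": 60},
    {"name": "Curry Corner", "cuisine": "indian", "rating": 4.4, "cost": 25},
    {"name": "Pasta Place", "cuisine": "italian", "rating": 4.8, "cost": 40},
]
user = {"preferred_cuisines": {"indian"}, "average_spend": 60, "visited": {"Spice Court"}}

tiles = build_tiles(restaurants, user)
for tile in tiles:
    print(f"{tile.tile}: {tile.restaurant} ({tile.evidence})")
```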


It’s time to Reset the AI application?

Do you think AI is changing your thinking ability?  From applications recommending what movies to watch, what songs to listen to, what to buy, what to eat, what ads you see, and the list goes on… all are driven by applications learning from you or delivering information through collective intelligence (i.e., people like you, location based etc.).

But are you sure the right recommendation is being provided to you or are you consuming the information as-is and adapting to it? Have you ever thought, would you have reached the same conclusion by applying your research and mental knowledge?

To add on, with information being readily available, less time and mental ability are spent on problem solving and more effort is spent on searching for solutions online.

As we build smarter applications in the future, which keep learning everything about you, do you think this would change our thinking patterns even further?

Apart from AI systems trying to learn, there are other ethical issues around trust and bias, and around how you design and validate systems whose recommendations humans can consume to make unbiased decisions.  I have covered this as part of my earlier article – https://navveenbalani.dev/index.php/articles/responsible-and-ethical-ai-building-explainable-models/

As we are creators and validators of the AI system, the onus lies on us (humans) to ensure any technology is used for good.

As standards and compliance are still evolving in the AI world, we should start designing systems that let the user decide how to use the application and when to reset it.

I am suggesting a few approaches below to drive discussions in this area, which needs contributions from everyone to help deliver smart and transparent AI applications in the future.

The Uber Persona Model

All applications build some kind of semantic user profile incrementally to understand more about the user and provide recommendations. Making this transparent to the user should be the first step.

Your application can have various semantic user profiles, one about you and one about your community (similar to you, location based, etc.), along with how each has been derived over a period of time. Finally, your application should have a Reset Profile option that lets you reset your profile, or a “Private AI” profile that lets you use the application without it knowing anything about you, so you can discover the required information yourself. Leaving the choice of profile to the end users should lead to better control and transparency and help users build trust in the system.
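One way to picture these profiles is the small sketch below. The class and method names (`UserProfiles`, `ProfileMode`, `learn`, `reset`) are hypothetical and chosen only for this illustration. The idea is that the user picks the active profile, the private mode neither reads nor records anything about the user, and a reset wipes what has been learned.

```python
from enum import Enum


class ProfileMode(Enum):
    PERSONAL = "personal"    # recommendations driven by what the app learned about you
    COMMUNITY = "community"  # recommendations driven by people similar to you
    PRIVATE = "private"      # "Private AI": nothing about you is used or learned


class UserProfiles:
    def __init__(self, community_signals):
        self.personal_signals = {}
        self.community_signals = dict(community_signals)
        self.mode = ProfileMode.PERSONAL

    def learn(self, topic, weight=1):
        """Record an interaction, unless the user is in private mode."""
        if self.mode is not ProfileMode.PRIVATE:
            self.personal_signals[topic] = self.personal_signals.get(topic, 0) + weight

    def reset(self):
        """Reset Profile: forget everything learned about the user."""
        self.personal_signals.clear()

    def active_signals(self):
        """Signals the recommender is allowed to use in the current mode."""
        if self.mode is ProfileMode.PERSONAL:
            return dict(self.personal_signals)
        if self.mode is ProfileMode.COMMUNITY:
            return dict(self.community_signals)
        return {}


profiles = UserProfiles({"comedy": 12})
profiles.learn("thriller")
profiles.mode = ProfileMode.PRIVATE
profiles.learn("horror")  # ignored while in private mode
print(profiles.active_signals())
```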

Explainability and Auditability

Designing applications with explainability in mind should be a key design principle. If the user receives an output from an AI algorithm, information on why this output was presented and how relevant it is should be built into the algorithm. This would empower users to understand why a particular piece of information is being presented and to turn on/off any preferences associated with an AI algorithm for future recommendations/suggestions.

For instance, take the example of server auditing, where you have tools that log every request and response, track changes in the environment, assess access controls and risk and provide end-to-end transparency.  

The same level of auditing is required when AI delivers an output – what was the input, what version of the model was used, what features were evaluated, what data was used for evaluation, what was the confidence score, what was the threshold, what output was delivered and what was the feedback.
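The audit fields listed above map naturally onto a structured log record. The sketch below is one possible shape, assuming a JSON log line per prediction; the class name `AuditRecord`, the field names and the sample values are illustrative rather than taken from any particular tool.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class AuditRecord:
    input_text: str            # what was the input
    model_version: str         # what version of the model was used
    features_evaluated: list   # what features were evaluated
    data_snapshot: str         # what data was used for evaluation
    confidence: float          # what was the confidence score
    threshold: float           # what was the threshold
    output: str                # what output was delivered
    feedback: Optional[str] = None  # what was the feedback (filled in later)
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    @property
    def delivered(self):
        """True when the confidence score met the threshold."""
        return self.confidence >= self.threshold

    def to_json(self):
        """Serialize the record as one JSON log line."""
        return json.dumps(asdict(self), sort_keys=True)


record = AuditRecord(
    input_text="restaurants near me",
    model_version="recsys-1.3",
    features_evaluated=["rating", "cuisine", "distance"],
    data_snapshot="catalog-2023-02",
    confidence=0.82,
    threshold=0.7,
    output="Curry Corner",
)
print(record.to_json())
```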

Gamifying the Knowledge Discovery

As information is readily available, how do you make it consumable in a way that nudges users to use their mental ability to find solutions, rather than giving all the information in one go? This would be particularly relevant to how education in general (especially in schools/universities) is delivered to everyone in the future.

How about a Google-like smart search engine which delivers information in a way that lets you test your skills? As mentioned earlier in the Uber Persona Model section, the choice is up to the user to switch this recommendation on or off.

I hope this article gave you enough insights into this important area.

To conclude, I would say the only difference between AI and all of us in the future would be our ability to think wisely and build the future we want.


The chatbot hype failed to live up

AI chatbots give a perception of being intelligent, but intelligence is a long way away, says Navveen Balani.

Read my article on why the first generation of chatbots did not live up to the hype. The article was featured in the August edition of https://www.industrialautomationindia.in/ magazine

Here is the link to the content from the magazine – http://navveenbalani.dev/wp-content/uploads/2020/08/navveen-magazine.pdf


AI Chatbots – Reality v/s Hype

AI chatbots give a perception of being intelligent, but intelligence is a long way away.

Uncover some of the real facts on chatbots and limitations associated with current AI chatbot platform and frameworks. The intent of the article is to help readers take informed decisions on how to design AI chatbots and workarounds with the existing chatbot implementation.

What are Chatbots?

A chatbot is a software program which carries out a conversation with a human. The conversation can be through textual methods, voice or even through recognizing human expressions.

Chatbot interactions can range from simple questions being answered, like “what is the outside temperature”, to sophisticated use cases which require a series of dialogues to arrive at an outcome, like a chatbot for booking holiday trips or providing financial advice.

What are the technologies used to build Chatbots?

Chatbots are not a new concept. Earlier technologies used a fixed set of inputs from the user to drive conversations, or scanned the input message to find keywords and look up information/responses from a database. These were mostly rule-based and keyword-driven, without understanding the context and meaning of the input message. Based on the input, a predefined programmed response would be provided.

With the advent of AI, chatbots use technologies like Natural Language Processing to understand the language and intent of the input message and take the appropriate action. Because the system tries to understand the language, it is able to identify the intent even when users ask the same question in multiple ways. Once the intent is identified, you can extract the topics of interest from the input.

Info – Natural language processing (NLP) is a branch of AI to help systems understand, interpret and process human languages.

 For instance –

“Find the cheapest flight from US to UK” is similar to “Find me lowest air fare from US to UK”.

Here the intent is – cheapest or lowest flight

Topics are – Location: From location – US, To location – UK

Action – Search flights.
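To make the intent, topics and action concrete, here is a deliberately simple sketch. It uses keyword matching and a regular expression as a stand-in for a trained NLP model, so it only handles phrasings it has been told about; the phrase list, the `parse_flight_query` function and the result keys are assumptions made for this illustration.

```python
import re

# Phrases that signal the "cheapest flight" intent (illustrative, not exhaustive).
CHEAPEST_PHRASES = ("cheapest", "lowest air fare", "lowest fare")
ROUTE_PATTERN = re.compile(r"from\s+(\w+)\s+to\s+(\w+)", re.IGNORECASE)


def parse_flight_query(text):
    """Extract intent, topics (locations) and the action to take from a query."""
    lowered = text.lower()
    intent = "cheapest_flight" if any(p in lowered for p in CHEAPEST_PHRASES) else "unknown"
    match = ROUTE_PATTERN.search(text)
    topics = {"from": match.group(1), "to": match.group(2)} if match else {}
    action = "search_flights" if intent != "unknown" and topics else None
    return {"intent": intent, "topics": topics, "action": action}


first = parse_flight_query("Find the cheapest flight from US to UK")
second = parse_flight_query("Find me lowest air fare from US to UK")
print(first)
print(first == second)
```

A real NLP service would replace the phrase list with a model trained on many variations of the question, which is what lets it generalize to wordings it has not seen.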

An AI open source package or an AI NLP cloud service can be used to develop chatbots. Let’s refer to this as the chatbot implementation from here on. We will talk about the chatbot implementation in detail during the course of this article.

What should I keep in mind for developing an AI Chatbot?

Chatbots work well when the domain is well understood by the AI system.

As the AI chatbot relies on NLP to understand the semantics of the input message, unless the NLP parser is trained on the domain, the accuracy of recognizing the intent and topics of interest would be very low or not as per acceptable criteria.

Take an example of a shopping chatbot which advises user what to buy based on the latest fashion trends.

Consider 3 queries below from a user –

Query 1 –   Show me medium size trending black and white dresses for Christmas party

Query 2 –   Show me white color, 3 inches platform heels

Query 3 –   Find And black and white floral dress under 2000

Here the chatbot needs to understand the following

  • Understand the shopping language.
  • Understand the intent – It’s a shopping query
  • Understand the domain – It’s a shopping query for apparel and shoes (there can be multiple domains – grocery, electronics, books, etc.).
  • Understand clothing shopping category and terminology –
    • Category – dresses, sandals etc.
    • Variants – sizes (medium/large etc.), color (various colors and combinations like black and white), heel size (3 inches. etc.)
    • Prices and ranges – 2000, etc.
    • Brands like – AND, Nike etc.

Out of the box, any chatbot implementation wouldn’t understand the domain. You need to train the chatbot on the custom domain to recognize the context and the language.

For instance, out-of-the-box NLP parsers would not recognize “AND” as a brand. Let’s inspect how well some of the leading Cloud AI NLP services recognize the sentence – “Find And black and white floral dress under 2000”

Here is a snapshot from Watson NLP (out of the box) implementation.

Figure: Keywords from Watson NLP

Figure: Concepts from Watson NLP

Figure: Part of Speech from Watson NLP

As you can see, Watson NLP recognizes “white floral dress” as keywords and “Black” as a concept.  Ideally it should have recognized “black and white” as the concept, as we are looking for a combination of these colors.

The dress could also be a concept, as it’s quite generic. “Floral” can be a keyword which has a dependency on dress. Identifying all the facts in the right way is important, as based on the facts you would convert this into a search query to get the required details from the data store (or from the respective search indexes).

For instance, the above should result as –

Color = “black and white”

Category = “Dress”

Gender = “Female”

Price < 2000

Pattern = “floral” or Keyword within category = “floral”

(where color, category, gender, price, pattern are all the columns or indexes you are searching against.)
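As a sketch of that last step, the snippet below turns the extracted facts into a parameterized query against an in-memory SQLite table. The `products` table, its columns and the sample rows are made up for the example; a real system might query a search index instead.

```python
import sqlite3

# Facts extracted from "Find And black and white floral dress under 2000"
# (brand handling is left out to keep the sketch short).
facts = {
    "color": "black and white",
    "category": "Dress",
    "gender": "Female",
    "pattern": "floral",
    "max_price": 2000,
}


def build_query(facts):
    """Build a parameterized SQL query from the extracted facts."""
    sql = (
        "SELECT name FROM products "
        "WHERE color = ? AND category = ? AND gender = ? "
        "AND pattern = ? AND price < ? ORDER BY price"
    )
    params = (
        facts["color"], facts["category"], facts["gender"],
        facts["pattern"], facts["max_price"],
    )
    return sql, params


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products "
    "(name TEXT, color TEXT, category TEXT, gender TEXT, pattern TEXT, price REAL)"
)
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("Floral midi dress", "black and white", "Dress", "Female", "floral", 1499),
        ("Floral gown", "black and white", "Dress", "Female", "floral", 2499),
        ("Plain black dress", "black", "Dress", "Female", "plain", 999),
    ],
)

sql, params = build_query(facts)
results = [row[0] for row in conn.execute(sql, params)]
print(results)
```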

The Watson NLP parser doesn’t recognize “And” as a brand; it recognizes “And” as a conjunction (“CCONJ”) in part of speech, which is expected as it’s not trained on it.

Let’s check how Google NLP classifies this sentence. Here is a snapshot from Google NLP.

Figure: Entity classification from Google NLP

Figure: Part of Speech from Google NLP

As you can see from the above figures, Google NLP identifies the entity as “dress” but doesn’t identify the colors “black and white”. With respect to part-of-speech tagging, it’s similar to Watson NLP, recognizing “And” as a conjunction (“CONJ”) and not as a brand.

The above is true for any NLP implementation available today: it fails to understand the full context of the sentence, even though the use case was pretty simple. Even if we train the NLP implementation on these examples, it would fall short, as you need to plug in specific NLP rules for such conditions to get the desired results. As the complexity and the context that needs to be inferred increase, training would not help either, as you can never come up with a generalized model for such conditions. That is the single biggest limitation if we rely only on today’s generation of NLP implementations.

Based on my experience building a sophisticated personalized shopping advisor, none of the out-of-the-box AI NLP implementations fitted the requirements. A simple scenario is these 3 phrases – “black and white dress”, “and black dress” and “blue jeans and white shirt”. In each of these 3 examples, the word “and” carries a different meaning. In the first case, it represents a combined color, “black and white”; in the second, “and” represents a brand; and in the third, it joins two queries as a conjunction. Even with the required training, a generalized model was not possible with any of the available solutions. This is just one of the many examples I could highlight. Imagine the complexity when dealing with medical literature. In our case, we ended up building our own domain-specific NLP implementation, which worked for all such scenarios.

In general, while designing chatbot solutions, start with a closed domain and the kind of questions the chatbot needs to answer. Don’t start by building a general-purpose chatbot, as it would be difficult to get the required accuracy. Secondly, if you are using any cloud vendor or third-party implementation, check whether your use cases can be solved by the default implementation or whether you need to build components to work around it.
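To give a feel for what a domain-specific rule can look like, here is a toy sketch that labels each occurrence of “and” in a shopping query. It is not the implementation described above; the vocabularies, the `classify_and` function and the three rules are assumptions made for illustration, and real queries would need far richer rules.

```python
# Tiny hand-made vocabularies; a real system would load these from the catalog.
COLORS = {"black", "white", "blue", "red"}
BRANDS = {"and", "nike"}
LEADING_WORDS = {"find", "show", "me"}


def classify_and(query):
    """Label each occurrence of 'and' as combined_color, brand or conjunction."""
    tokens = query.lower().split()
    roles = []
    for i, token in enumerate(tokens):
        if token != "and":
            continue
        prev_tok = tokens[i - 1] if i > 0 else None
        next_tok = tokens[i + 1] if i + 1 < len(tokens) else None
        if prev_tok in COLORS and next_tok in COLORS:
            # A color on both sides, as in "black and white".
            roles.append("combined_color")
        elif "and" in BRANDS and (prev_tok is None or prev_tok in LEADING_WORDS):
            # At the start of the query or right after a command word.
            roles.append("brand")
        else:
            # Otherwise it joins two separate requests.
            roles.append("conjunction")
    return roles


for phrase in ("black and white dress", "and black dress", "blue jeans and white shirt"):
    print(phrase, "->", classify_and(phrase))
```

Even this small rule set breaks down quickly (for example, “black and blue jeans” is genuinely ambiguous), which is why such rules have to be designed together with knowledge of the domain.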

What are typical use cases for building a chatbot?

In today’s digital age, customers are looking for instant information and speedy resolution to all their queries.

Chatbots provide an efficient way to stay connected with end customers directly and provide information at their fingertips – be it through a messaging chat application or through a voice-enabled service like Alexa and Google Home.

Some of the typical use cases are listed below –

  • Ability to know your customers and directly interact with them over various channels, like retail brands directly connecting to their end customers.
  • Improve customer engagement, interaction and provide speedy resolution.
  • Scaling customer service operations by providing relevant information 24 by 7 at the customer’s fingertips.
  • Understand customers and their preferences better to provide hyper personalized service, like a personal assistant.
  • Provide an ability to interact with connected devices, like Smart Homes in a natural and intuitive way.
  • Provide expert guidance, like a financial assistant chatbot providing investment suggestions.

What are the high level steps for building an AI chatbot?

Following are the high level steps to build an AI chatbot

  • Define the business use case and end goal for building the chatbot.
  • Define Conversation interfaces
    • Define what kind of questions need to be answered
    • Define the conversation/dialog flow, i.e. how the various interactions would happen with the user. For instance, booking a flight is one dialog flow, booking a hotel is another dialog flow, etc. Within each dialog flow, define the interaction flow with the user.
    • Define how to capture feedback from the user on the answers provided. Feedback can be explicit, like the user rating the answer, or implicit, like how much time the user spends looking at the answer and the follow-up activity after that.
  • Question / Answer exploration
    • Identify existing sources (if any) for questions, like website FAQ, call center logs etc.
    • Create a representation of the questions that would be asked.
    • Create variations of the questions for training the chatbot to understand the language and generalize well.
    • Identify the source of answers – whether they would be programmed responses or come from internal knowledge sources and documents (like available technical manuals for troubleshooting device-related queries).
  • Pick a technology approach

In this step, you will decide how to implement the chatbot. There are 2 approaches – building your own chatbot implementation using available frameworks (like TensorFlow and NLP libraries like NLTK) and custom components, or using an existing platform service like Google NLP, Amazon Lex or Azure Bot Service.

In both approaches, you would need to train the chatbot implementation to recognize the question intent, domain and language. Existing platform services have simplified this process by providing utilities that make it easier to create chatbots. For more details, kindly refer to “How do you build chatbot using chatbot platforms”.

  • Pick a delivery channel

In this step, you will decide how to expose the chatbot to end users through the required channel. The channel can be web, mobile or voice-enabled devices.

Your chat implementation would typically expose an API (to ask questions and get responses) which can be called by a channel implementation. You can also release your chatbot implementation over third party services like Facebook Messenger or voice enabled services like Amazon Alexa. For more details, kindly refer to “How do you Integrate chatbots with third party services”.

  • Release, Monitoring and Feedback

Once the chatbot is released, you would typically store all the user interactions to help you analyze user behavior and preferences better. The user and behavior data in turn would be used to provide a more personalized service. How you use this new user information depends on your use case. For instance, if a travel chatbot is recommending a new holiday trip, it can suggest options based on your last trip interaction. You need to build a recommendation system that looks at the history of the user’s past interactions and suggests options. For details on how to build recommendation systems, kindly refer to the Recommendations chapter of the Real AI book.

Another important point is to capture feedback from the user at regular intervals to understand whether the chatbot is providing the right information. The feedback captured will be used to improve the chatbot implementation, which can mean retraining it with new information. For instance, your chatbot may not be trained to recognize certain entities and concepts, and as a result its responses would not be appropriate. You need to plan for building and releasing incremental models based on the feedback.
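As a rough illustration of the feedback loop described above, the sketch below records explicit feedback (a rating) and implicit feedback (time spent on the answer) per intent, and flags intents whose average rating is low as candidates for the next incremental model. The `Feedback` record, the intent names and the thresholds are all assumptions made for this example, and the implicit signal is only stored here, not scored.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Feedback:
    intent: str                    # intent the chatbot matched for the question
    rating: Optional[int] = None   # explicit signal: 1-5 rating, if the user gave one
    dwell_seconds: float = 0.0     # implicit signal: time spent looking at the answer


def intents_needing_retraining(events, min_avg_rating=3.0, min_events=2):
    """Return intents whose average explicit rating falls below the threshold."""
    ratings = defaultdict(list)
    for event in events:
        if event.rating is not None:
            ratings[event.intent].append(event.rating)
    return sorted(
        intent
        for intent, values in ratings.items()
        if len(values) >= min_events and sum(values) / len(values) < min_avg_rating
    )


events = [
    Feedback("book_flight", rating=5, dwell_seconds=12.0),
    Feedback("book_flight", rating=4),
    Feedback("cancel_hotel", rating=1, dwell_seconds=2.0),
    Feedback("cancel_hotel", rating=2),
    Feedback("cancel_hotel", dwell_seconds=1.5),  # implicit-only event, no rating
]
print(intents_needing_retraining(events))  # ['cancel_hotel']
```

In a real deployment the flagged intents would feed into new training examples; how implicit signals are weighted is a design decision that depends on the use case.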

How do you Integrate chatbots with third party services?

As part of your chatbot technology implementation, your chat implementation would typically expose an API (to ask questions and get responses) which can be called by a channel implementation.

The channel can be web, mobile or voice-enabled devices. If you already have an existing mobile application, you can embed the chatbot as part of the mobile application.

You can also release your chatbot implementation through third-party chat-enabled services like Facebook Messenger or through a voice-enabled service like Amazon Alexa as a skill.

All of these chat-enabled services provide a framework to plug in your own implementation. The framework provides hooks or code interceptors for intercepting the chat message. You need to extend their framework and plug in your own implementation. For example, if a user asks a question on Facebook Messenger, the question would be handed to your chat implementation through predefined hooks. You would process the message and return the response, which the service then delivers to the user.
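The hook pattern can be sketched in a few lines. This is a generic illustration only: the payload fields (`sender_id`, `text`, `recipient_id`) and both function names are hypothetical and do not correspond to the actual Facebook Messenger or Alexa APIs, each of which defines its own request and response format.

```python
def answer_question(question: str) -> str:
    """Stand-in for the chatbot's own question/answer API."""
    if "opening hours" in question.lower():
        return "We are open from 9 am to 6 pm."
    return "Sorry, I don't know that yet."


def handle_channel_message(payload: dict) -> dict:
    """Hook that a (hypothetical) channel framework calls for every incoming message."""
    reply = answer_question(payload["text"])
    # Echo the sender id so the channel knows which user should receive the reply.
    return {"recipient_id": payload["sender_id"], "text": reply}


incoming = {"sender_id": "user-42", "text": "What are your opening hours?"}
print(handle_channel_message(incoming))
```

Keeping `answer_question` independent of any channel means the same chatbot API can sit behind a web widget, a messaging service or a voice skill, with only the thin hook changing per channel.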

Similarly, if you need to make your chatbot available over Alexa, you need to wrap it as an Alexa Skill using the Alexa Skills Kit interface. Once the user enables your skill in Alexa, voice messages meant for the skill are passed to it, and you can provide the required implementation and responses as per your chatbot.

For more details, kindly refer to “How do you build chatbot using chatbot platforms”.

How do you build chatbot using chatbot platforms?

A chatbot platform provides you a set of services to design, develop and deploy your chatbot. They provide you with a framework and guided set of utilities to build a chatbot.

Cloud providers like AWS, Azure, IBM and Google Cloud provide a set of services that help you create conversations, understand the conversation language using NLP techniques, hook in the required actions and deliver the solution via APIs.

The fundamental approach adopted by each of these providers is the same. They allow developers to

  • Design conversation flows using some visual interface or tooling provided by cloud provider
  • Through these conversation flows you
    • Provide a set of questions and multiple ways you can ask the same question
    • Define the intent of the question. For example, for the question “Find cheapest flight from US to UK”, the intent is to find the lowest air fare.
    • Define which entities of interest to extract from the question. The chatbot provider needs to be made aware of these entities. In the above example, the entities are the countries US and UK. These entities can be generic ones that the cloud provider recognizes automatically, or the cloud provider offers a mechanism for you to supply or train these entities (including synonyms etc.) through its tooling.
    • Use the entities extracted to carry out the required action for the intent. For instance, in the above example, call a flight API service providing US and UK as the “from” and “to” locations.
    • Provide the response.
  • Test and expose the chatbot through an endpoint
    • The cloud vendor typically provides an ability to expose the functionality for your chatbot through an endpoint, like a REST API.

The above technology works for flows of simple to medium complexity – like FAQs, pointed questions and answers for customer queries, fixed sets of steps (booking a cab etc.) and so on. Anything which requires sophisticated handling of queries, like the shopping advisor example, needs to be custom developed using NLP and other techniques.
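To make the intent and entity steps concrete, here is a toy sketch using word overlap and a regular expression. It is not how any cloud provider implements these steps (they use trained NLP models); the intent names, example phrasings and the country synonym list are assumptions invented for this illustration.

```python
import re

# Hypothetical training data: intent name -> example phrasings of the same question.
INTENT_EXAMPLES = {
    "find_cheapest_flight": ["find cheapest flight", "lowest air fare", "cheap flights"],
    "book_hotel": ["book a hotel", "reserve a room", "find a hotel"],
}
# Hypothetical entity list, with synonyms mapped to a canonical value.
COUNTRIES = {"us": "US", "usa": "US", "united states": "US",
             "uk": "UK", "united kingdom": "UK"}


def tokenize(text):
    return set(re.findall(r"[a-z]+", text.lower()))


def detect_intent(question):
    """Pick the intent whose example phrasings share the most words with the question."""
    words = tokenize(question)
    scores = {
        intent: max(len(words & tokenize(example)) for example in examples)
        for intent, examples in INTENT_EXAMPLES.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def extract_route(question):
    """Extract the 'from' and 'to' country entities, resolving synonyms."""
    names = "|".join(sorted(COUNTRIES, key=len, reverse=True))  # longest names first
    match = re.search(rf"\bfrom ({names}) to ({names})\b", question.lower())
    if not match:
        return None
    return {"from": COUNTRIES[match.group(1)], "to": COUNTRIES[match.group(2)]}


question = "Find cheapest flight from US to UK"
print(detect_intent(question))   # find_cheapest_flight
print(extract_route(question))   # {'from': 'US', 'to': 'UK'}
```

The extracted `from` and `to` values are what you would pass to the flight API as the action for this intent.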

Info – Microsoft has a QnA service (https://www.qnamaker.ai) that lets you create a bot from an FAQ.

What is not real about Chatbots?

Chatbots are examples of Weak AI. The current generation of chatbots can be thought of as smart dialog systems driven by techniques like NLP and fixed conversation flows.

Out of the box, a chatbot doesn’t understand any domain. We need to train the chatbot to understand the domain. Also, based on the complexity of the domain, you would incrementally train and add subdomains. For instance, a chatbot helping you book a cab is an example of a fixed domain, while a chatbot assisting doctors with cancer treatment would be trained on various types of cancer incrementally. As mentioned in the shopping advisor example, understanding the meaning of the same word in different contexts is difficult for the current generation of NLP implementations, and you need to rely on custom techniques to handle such conditions.

Now, let’s look at some marketing gimmicks around AI chatbots –

  • INGEST AND KNOW IT ALL chatbots – These are chatbots marketed on the promise that you can ingest millions of documents, like medical literature, and then ask questions to get expert assistance, like diagnosis of diseases. Unless trained appropriately, such systems will never provide the desired accuracy.

By appropriately, I mean it can take years to train these systems. The fundamental problem with these systems is that they still don’t understand the complete language and complexity of the domain. You typically end up with custom domain adaptations and an endless set of language rules, which is definitely not manageable in the long run. The predictions of such systems are usually not accurate.

  • Self learning chatbots – How often have you heard the term self learning chatbots? This again is a misconception, where a chatbot supposedly learns on its own. You have to train the chatbot on what you want it to learn. Usually you would capture user behavior details through the user’s interactions with the chatbot application. This would include capturing user analytics information, like their likes or dislikes, through either explicit or implicit means. Explicit information can be a user rating a product, and implicit information can be the time a user spent looking at a response.

Once you know the user well and have their data, it becomes a recommendation problem of what you want to recommend to the user. So, you end up building a recommendation algorithm. For instance, for a fintech application, this would mean recommending similar stocks based on the stocks the user views regularly or holds in their portfolio.

Different domains and use cases would need different recommendation algorithms, and these need to be developed as part of the chatbot. However, the learning is boxed. For instance, if you have a chatbot which assists you in booking restaurants, it can recommend similar restaurants, but it can’t recommend places to stay, as it only knows about your taste in restaurants.

Well, someone can build a recommendation system which tracks what users eat and where they stay and then tries to come up with a correlation that yields a recommendation, as the system now knows – “users eating XYZ are most likely adventurous. So, recommend some trekking place.” Again, in this case, the recommendation is boxed to what you know and what you want to recommend. I don’t know if any such hypotheses exist, but they can only be inferred through data and feedback. The point is, all of these hypotheses, data and feedback need to be designed and developed, and saying chatbots learn on their own is quite misleading.

  • General purpose, generative chatbots – A chatbot which is capable of learning new concepts from scratch and providing responses like a human. As it learns from an open domain, such a chatbot would start behaving like the famous Microsoft Tay chatbot (which was shut down within a day of its launch), which picked up unwanted details from tweets and started posting inflammatory and offensive tweets. This is a classic example of what I quoted earlier – “AI can learn, but can’t think”. Generative chatbots formulate the response based on the probability of words and create a grammatically correct sentence, without understanding its real meaning.
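The idea of formulating a response purely from word probabilities can be illustrated with a tiny bigram model. This is a deliberately simplified sketch, far smaller than any real generative chatbot; the three-sentence corpus and the function names are made up for the example.

```python
import random
from collections import defaultdict


def train_bigrams(sentences):
    """Record which word follows which in the training sentences."""
    followers = defaultdict(list)
    for sentence in sentences:
        words = ["<s>"] + sentence.lower().split() + ["</s>"]
        for current, nxt in zip(words, words[1:]):
            followers[current].append(nxt)
    return followers


def generate(followers, rng, max_words=12):
    """Sample one word at a time, in proportion to how often it followed the previous word."""
    word, output = "<s>", []
    while len(output) < max_words:
        word = rng.choice(followers[word])
        if word == "</s>":
            break
        output.append(word)
    return " ".join(output)


corpus = [
    "the flight is cheap",
    "the hotel is cheap",
    "the flight is late",
]
model = train_bigrams(corpus)
print(generate(model, random.Random(0)))
```

Depending on the random seed, the model can produce "the hotel is late", a sentence that is well formed but was never in the training data and is not grounded in any fact. The model only knows which word tends to follow which.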

As I mentioned earlier, the first focus should be on getting domain specific chatbots right and with the current techniques we are far away from realizing the vision.

Will chatbots make human agents obsolete?

To answer this question, let’s understand what functionality chatbots currently provide.

Current chatbot implementations do well at handling a fixed set of dialogs with the user, repetitive tasks and certain initial aspects of customer service tasks. Wherever there is a fixed set of processes and flows to automate, a chatbot can be used to provide 24x7 support for any queries. If human expertise is used for answering a basic set of questions where answers are readily available, it would eventually be replaced.

But in real-life scenarios, most conversations don’t follow a fixed flow paradigm. If the conversation moves from basic questions to questions which need further analysis, or the topic of conversation changes, you need a sophisticated chatbot implementation to take care of various conversation flows, identify the context switch, identify intents which your chatbot may not be aware of and create queries to find that information from your knowledge source. You are now moving from a fixed set of flows to more dynamic flows which need to be interpreted by your chatbot. Building such complex chatbot implementations requires sophisticated domain-specific adaptation using machine learning techniques and custom solutions. Current out-of-the-box chatbot services fall short of building such chatbot implementations.

And even if you have all the data in the world at your disposal, infinite processing and computation power, using the current generation technology and research, you can never build a system that can compete with an expert human in the field. Taking even a 5 year horizon from now, I don’t think we can develop such a level of intelligent chatbots.

For instance, can a chatbot or an assistant help a doctor recommend cancer treatments accurately and consistently? The answer is No.

The information provided by a chatbot can give doctors a clue, but it may be right or wrong. You can never certify this. The chatbot would always act as an assistant to an expert person to get some job done. Ultimately, these systems are throwing out a bunch of answers based on some probabilities. The answers are limited to what you have fed into the system; the system can’t infer new knowledge on the fly or correlate information like a human expert to come to a conclusion.

While there is research going on to use deep neural nets for conversation flows, we are still quite far away from building truly conversational interfaces which understand the nitty-gritty of language and domain. Also, the answers provided need to be explainable, and unless you have a way to backtrack on why a particular answer was provided, such deep neural systems can’t be used for use cases which require auditability and explainability.

In short, enjoy the smart chatbots that give a perception of being intelligent, but real intelligence is a long way away.

Can AI generate dynamic responses to questions?

You can use deep learning to build a chatbot. Various deep learning architectures are available to solve specific kinds of use cases. For instance, for computer vision (e.g. image recognition) you would use a convolutional neural network as the starting point, for language translation or text generation you would go with a recurrent neural network, and so on.

For understanding chat conversations, you would start with a variant of the recurrent neural network and build a sequence to sequence model. A sequence to sequence model, in simple words, consists of 2 components: the first component (encoder) tries to capture the context of the input sentence through its hidden layers, and the second component (decoder) takes the output from the encoder and generates the response.
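The encoder/decoder data flow can be sketched in plain Python. This is only a structural illustration under strong assumptions: the weights are random and untrained, so the generated tokens are meaningless, and the vocabulary, hidden size and class name are invented for the example. A real model would be built with a deep learning framework and learn its weights from question/response pairs.

```python
import math
import random

VOCAB = ["<eos>", "hello", "how", "are", "you", "fine", "thanks"]
HIDDEN = 4


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def one_hot(index):
    return [1.0 if i == index else 0.0 for i in range(len(VOCAB))]


def rnn_step(w_in, w_hidden, token_index, hidden):
    """One recurrent step: new hidden state from the current token and the previous state."""
    a = matvec(w_in, one_hot(token_index))
    b = matvec(w_hidden, hidden)
    return [math.tanh(x + y) for x, y in zip(a, b)]


class Seq2Seq:
    def __init__(self, seed=0):
        rng = random.Random(seed)
        # Untrained random weights; training would adjust these.
        self.enc_in = random_matrix(HIDDEN, len(VOCAB), rng)
        self.enc_hidden = random_matrix(HIDDEN, HIDDEN, rng)
        self.dec_in = random_matrix(HIDDEN, len(VOCAB), rng)
        self.dec_hidden = random_matrix(HIDDEN, HIDDEN, rng)
        self.dec_out = random_matrix(len(VOCAB), HIDDEN, rng)

    def encode(self, tokens):
        """Encoder: fold the input sentence into a single context vector."""
        hidden = [0.0] * HIDDEN
        for token in tokens:
            hidden = rnn_step(self.enc_in, self.enc_hidden, VOCAB.index(token), hidden)
        return hidden

    def decode(self, context, max_len=5):
        """Decoder: start from the context vector and greedily emit one token per step."""
        # "<eos>" doubles as the start token in this sketch.
        hidden, token_index, output = context, VOCAB.index("<eos>"), []
        for _ in range(max_len):
            hidden = rnn_step(self.dec_in, self.dec_hidden, token_index, hidden)
            scores = matvec(self.dec_out, hidden)
            token_index = scores.index(max(scores))
            if VOCAB[token_index] == "<eos>":
                break
            output.append(VOCAB[token_index])
        return output


model = Seq2Seq()
context = model.encode(["how", "are", "you"])
print(model.decode(context))
```

Note that the whole input sentence is squeezed into one fixed-size context vector before any response token is produced.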

The above techniques require you to have a large set of training data, containing questions and responses. The technique works in a closed domain, but as the responses are dynamic in nature, exposing it directly to your end users can be a bit risky. Secondly, these techniques don’t work when you want to interpret the input sentence to extract the information and formulate a response on your own, like the shopping advisor query use case that we discussed above.

In case of open ended domain, the chatbots would start behaving similar to Microsoft Tay chatbot example I gave earlier.

Tip – With RNNs, the response/answer depends on the previous (earlier) states. For deep conversational use cases, where context from much earlier in the conversation needs to be available, plain RNNs don’t work well. You need to employ a variant of the RNN called LSTM (Long Short-Term Memory network). There is a lot of research going on in this area. Going through the various deep learning architectures is outside the scope of the book.

 

SUMMARY

The current generation of chatbots is a weak form of AI, which offers an ability to understand the intent of the input message/question. In order for a chatbot system to understand the intent, it needs to be trained on the corresponding domain. You can ask the same question in multiple ways and the chatbot implementation can still infer the intent.

For dialogs, the current technology offers defining fixed conversation flows, so the interactions are boxed and finite.

Chatbots do well at managing productivity and certain aspects of customer service tasks. However, as the complexity of the domain increases, current technology falls short, as even after sufficient training you would not get the required level of accuracy. You would need to rely on a combination of other machine learning technologies and solutions like rules, inferences and custom domain metadata to get the solution delivered. These become one-off solutions, which are difficult to generalize. For some cases, even the one-off solution would be very complex, like building an advisor for recommending cancer treatments accurately and consistently.

While there is research going on using deep neural nets, we are still quite far away from building a true conversational chatbot which understands the nitty-gritty of language and domain. Also, the answers provided need to be explainable, and unless you have a way to backtrack on why a particular answer was provided, such deep neural systems can’t be used for use cases which require auditability and explainability.


Are AI Chatbots Intelligent?

Welcome to the world of intelligent chatbots, your companion and conversation agents which would make your life smarter. A leading research paper even said by 2020, the average person will have more conversations with bots than with their spouse. So be ready to embrace this new life in a year from now.

Ok hold on, have you ever tried telling Siri or Google to “Find restaurants which doesn’t serve pizza”. At least they are both consistent in some way, they gave the same answer – suggesting restaurants which serve pizza.

Ok how about Sophia, the first humanoid robot citizen, which is making its way to every media event, giving interviews and boasting of human-like conversations. Well, the reality is far from that: it provides an illusion of understanding conversation, but as you start asking intelligent questions you would realize it can only answer a fixed set of questions.

Well by now, you should be able to separate the noise from reality. So, should I invest in chatbots with all these limitations? Yes, any technology would have its limitations, but you need to be aware of what you can build now, what to avoid and how to work around the limitations. I have seen many companies trying to build sophisticated chatbots using products from leading chatbot vendors and cloud offerings, spending millions of dollars and hitting a roadblock.

If you go by what is being projected and start building it out, you would soon realize these limitations one way or the other. The problem is that most of the vendors claim it’s very easy to build a chatbot, but in reality, all of these techniques fall short when it comes to building a true conversational agent.

With current implementations of chatbots, we are probably at the first generation of AI chatbots, which are more or less scripted and give answers to pointed questions. What I mean by scripted is that the chatbot is trained to understand general vocabulary, entities, metaphors, synonyms etc. The chatbot uses a fixed set of flows to understand the context. For domain-specific use cases, additional training is required, and you need to train on specific domain terminology and the relationships between the words. While there is research going on using deep neural nets, we are still quite far away from building a true conversational chatbot which understands the nitty-gritty of language and domain.

For instance, if you are building a shopping advisor chatbot, the term “black and white dress” implies “black and white” as color and dress as category. You might expect the color “black and white” is fairly generic and should be easily identified by the AI system, but that’s not really the case.

Based on my experience of building a sophisticated personalized shopping advisor, none of the AI NLP implementations fitted the requirements. A simple scenario is these 3 phrases – “black and white dress”, “and black dress” and “blue jeans and white shirt”. In all 3 examples, the word “and” has a different meaning. In the first case, it is part of a combined color “black and white”, in the second “and” represents a brand, and in the third it is a conjunction joining two queries. Even with the required training, a generalizing model was not possible with any of the available solutions. This is just one of many such examples. Imagine the complexity when dealing with medical literature.
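To show what a hand-written workaround for these three phrases might look like, here is a small rule-based sketch. It is not the implementation described above; the catalog sets (`COLORS`, `COMBINED_COLORS`, `BRANDS`, `CATEGORIES`) and the rules themselves are assumptions made for illustration, and they handle only these three cases.

```python
# Hypothetical catalog metadata; "and" as a brand name comes from the example above.
COLORS = {"black", "white", "blue"}
COMBINED_COLORS = {("black", "white")}   # color pairs treated as one two-tone color
BRANDS = {"and"}
CATEGORIES = {"dress", "jeans", "shirt"}


def parse_item(words):
    """Tag each word of a single-item query as brand, color or category."""
    item = {}
    for index, word in enumerate(words):
        if word in BRANDS and index == 0:
            item["brand"] = word.upper()
        elif word in COLORS:
            item["color"] = f'{item["color"]} and {word}' if "color" in item else word
        elif word in CATEGORIES:
            item["category"] = word
    return item


def parse_query(query):
    """Decide what "and" means, then parse one or two item queries."""
    words = query.lower().split()
    if "and" in words[1:]:                      # a leading "and" is treated as a brand
        i = words.index("and", 1)
        left, right = words[:i], words[i + 1:]
        is_combined_color = bool(left) and bool(right) and (left[-1], right[0]) in COMBINED_COLORS
        if not is_combined_color:
            return [parse_item(left), parse_item(right)]   # conjunction of two queries
        words = left + right                    # parse_item rebuilds "black and white"
    return [parse_item(words)]


for query in ["black and white dress", "and black dress", "blue jeans and white shirt"]:
    print(query, "->", parse_query(query))
```

Each new phrase pattern needs another rule or another metadata entry, which is exactly why such custom solutions are hard to generalize.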

In order to get a realistic view of what AI chatbots can achieve in today’s environment, the current limitations and workarounds, and what to expect in the future, you can refer to my book REAL AI for more details.
