Tonal languages

Introduction

I came across the concept of tonal languages when I was watching some song with my niece. My niece confused a word meaning for another. She wondered why the guy said it. I told her the guy meant another word. That moment I realised the words are the same and started to think about other words where the changes in tone change the meaning. Being a software engineer, I wondered how this could be modelled on a computer, for example, speech recognition and text to speech tasks. I decided to research the concept and learnt called tonal language.

So what are tonal languages?

Tonal languages are languages where the meaning of words differ according to pitch. Pitch is as vital as consonants and vowels for distinguishing one word from another. pitch is used to differentiate lexical or grammatical meaning. About 60% of world languages are tonal.

Other languages use tone to distinguish what they mean but are not tonal, for example, English use of high tone at the end of a sentence to show it’s a question. In English tones are always set by period, exclamation marks and question marks.

Tones in most tonal languages are represented using grave accent for low tone“\”, “- ”and acute “/”accents for high tone, some languages have underdots for example “ẹ ”, “ọ ”, and “ṣ”

A common example of tonal language is mā má mǎ mà in mandarin which translates to mom hemp horse scold. Ma in high tone means mother and Horse when it starts with high then low tone. In Yoruba, an African language Bi can mean give birth, ask, vomit and like.

Some languages only have two tones high and low while languages like Vietnamese have about 7 tones

Regions with tonal languages

Most tonal languages are concentrated around East and Southeast Asia, Central America and sub-Saharan Africa (except Swahili)

Example of tonal languages are Chinese, Vietnamese, Thai, and Punjabi in Asia, Yorùbá, Igbo, Luganda, Ewe and Zulu in Africa

Geography Of Tonal Languages

geogaphy map

Tonal languages are mostly in humid areas

Tonal languages are concentrated in tropical areas with high humidity levels. A study carried out on 3,750 languages from across language families shows that places with humidity have more tonal language because moist air allows vocal folds to move more freely, an aspect that is important in tonal languages for one to make a right tone for words to make sense.

Inhaling dry air dries out mucous membranes which affect vocal folds making it difficult to make a wide range of tonal pitches. This is one of the reasons why tonal languages are less popular in dry areas

Tonal Language Representation In Natural Language Processing. Natural language processing(NLP) is a field of artificial intelligence concerned with building systems that can understand, analyse, manipulate, text summarization and generate text.

NLP tasks include speech recognition, natural language understanding, and natural-language generation. A popular NLP commercial application is google translate and Siri where people can translate from one language to another, from speech to text.

Due to lack of some keyboards lacking key accent markings, tones marks and sub dots are omitted when creating some documents and this affects information retrieval. The lack of diacritics in languages like Yoruba makes it hard for the systems to translate well to other languages.

Diacritics are a sign, for example, an accent, which is written above or below letters to alter their pronunciation. when languages use them when using the Latin alphabet to represent unique sounds

Are Tonal Languages Hard To Learn By Humans And Computer

Talking drum

In Africa, some societies use talking drums for communication by making them produce a sound mimicking human speech.

When I searched for tonal languages most people seemed scared of learning them. Because they think they are hard. When learning a new language most people use textbook, articles online. The drawback of learning through text-based sources is most can’t teach you how to pronounce because some literature omits the tonal markings yet the languages highly rely on tones.

Humans

Honestly, I also got scared, I speak a tonal language because I have grown up listening and interacting in it. (I will be trying to learn a tonal language that is not Bantu dialect soon ). I interacted with a Yoruba friend while researching this post. For example, “awo” can mean ceramic plate, cult, quail. I know you might be thinking that must be confusing but Yoruba uses tone marks which makes it easy to know the word.

Tonal languages can be easy to learn because they have fewer words.

The hard part is learning tone but once you do then you are set. The easiest way to learn a tonal language is through prolonged listening. So if you are trying to learn one I advise you to learn by listening and spend less time reading.

Computer algorithms

NLP tasks rely on context to make predictions; this concept is also applied to tonal languages to determine the meaning of the word. When presented with sufficient training data algorithms can perform better in NLP tasks like text summarization, sentiment analysis, text generation etc on tonal language.

Conclusion

Though tonal languages seem hard to learn if you are exposed to them through listening you will be able to master them. Context also helps to know what word is being used.

Even though there are limited resources on tonal languages, there is an opportunity to improve the performance of them in computer systems for example diacritics prediction where the right diacritics of a word is added.

The small about of available datasets for tonal languages makes it hard to develop models that perform well on NLP tasks. You and I can contribute to making that a thing of the past by collecting data sets, resources of data or writing/transcribing that that can be added to available collections.

Further reading

If you are interested in this topic like I am you should find the following resources helpful. You can also add on the list and share what you discovered.

Tonal languages require humidity

The Diversity of Tone Languages and the Roles of Pitch Variation in Non-tone Languages: Considerations for Tone Perception Research

http://face-music.ch/instrum/uganda_drumen.htm

Natural language processing

Masakhane: Using AI to Bring African Languages Into the Global Conversation.