Google Translate 'learned' 110 new languages: among them Crimean Tatar and Chechen - ForumDaily
The article has been automatically translated into English by Google Translate from Russian and has not been edited.

Google Translate breaks down language barriers to help people communicate and better understand the world around them. The company continually applies the latest technology so that more people can use the tool, according to the Google blog.

Photo: Depositphotos

In 2022, Google Translate added 24 new languages using zero-shot machine translation. In this method, the model translates text from one language to another without having seen a single example of translation between those languages during training. This differs from traditional machine translation methods, which require large volumes of bilingual text to train the model.
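
To make the zero-shot idea concrete, here is a minimal Python sketch. It only labels a translation direction by the kind of training data available for it; it does not translate anything. The function name, the language codes and the sample data are all hypothetical illustrations, not Google's actual training setup.

```python
# Toy illustration of the zero-shot idea described above.
# All language codes and data sets here are hypothetical examples,
# not Google's actual training data.

def classify_direction(source, target, parallel_pairs, monolingual_langs):
    """Label a translation direction by the training data available for it."""
    # Direct translation examples exist for this direction.
    if (source, target) in parallel_pairs:
        return "supervised"
    # Every language the model has seen in any form during training.
    known = set(monolingual_langs) | {lang for pair in parallel_pairs for lang in pair}
    # Both languages are known, but never as a translated pair: zero-shot.
    if source in known and target in known:
        return "zero-shot"
    return "unsupported"


parallel_pairs = {("en", "fr"), ("fr", "en"), ("en", "hi"), ("hi", "en")}
monolingual_langs = {"crh"}  # text in this language only, with no translations

print(classify_direction("en", "fr", parallel_pairs, monolingual_langs))
print(classify_direction("fr", "hi", parallel_pairs, monolingual_langs))
print(classify_direction("en", "crh", parallel_pairs, monolingual_langs))
print(classify_direction("en", "xx", parallel_pairs, monolingual_langs))
```

In this sketch, French to Hindi counts as zero-shot because the model saw both languages but never a French-Hindi example, which matches the definition given above.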

In addition, Google announced the 1,000 Languages Initiative, a commitment to build AI models that support the 1,000 most widely spoken languages in the world.

Google Translate is now using artificial intelligence to expand the range of languages it supports. With its PaLM 2 large language model, the company is rolling out 110 new languages in Google Translate. PaLM 2 is one of the most advanced models in the field of AI and natural language processing, and it is used for tasks such as machine translation, text generation and text analysis.

Translation for more than half a billion people

From Cantonese to Q'eqchi', these new languages are spoken by more than 614 million people, about 8% of the world's population. Some are major world languages with more than 100 million speakers. Others are used by small indigenous communities. And a few have almost no native speakers left, but their communities are actively working to revive them.

About a quarter of the new languages come from Africa, including Fon, Kikongo, Luo, Ga, Swazi, Venda and Wolof.

Here are some of the new languages supported in Google Translate:

  • Afar is a tonal language spoken in Djibouti, Eritrea and Ethiopia. Of all the newly added languages, Afar received the most contributions from the volunteer community.
  • Cantonese has long been one of the most requested languages for Google Translate. Because Cantonese often overlaps with Mandarin in writing, it was difficult to find data and train the models.
  • Manx is the Celtic language of the Isle of Man. Its last native speaker died in 1974, but thanks to the Manx language revival movement, there are now thousands of speakers on the island.
  • NKo is a standardized form of the West African Manding languages that unifies many dialects into a common language. Its unique alphabet was invented in 1949, and the language is actively developed and used today.
  • Punjabi (Shahmukhi) is the variety of Punjabi written in the Perso-Arabic script (Shahmukhi). It is the most widely spoken language in Pakistan.
  • Tamazight (Amazigh) is a Berber language spoken throughout North Africa. Despite its many dialects, the written form is generally understandable across them. Tamazight is written in both the Latin and Tifinagh scripts, and Google Translate supports both.
  • Tok Pisin is an English-based creole and the lingua franca of Papua New Guinea. If you speak English, try translating a phrase into Tok Pisin - you may be able to make out the meaning!

How Google Translate selects languages

There are many factors to consider when adding new languages to Translate. Languages vary enormously: they have regional varieties, dialects and different spelling standards. In fact, many languages have no single standard form, so it is impossible to choose a “right” one. The company's approach was to prioritize the most commonly used varieties of each language. For example, Romani, spoken by Roma people in Europe, is a language with many dialects. The models produce text that is closest to Southern Vlax Romani, a variety often used on the Internet, but that also mixes in elements from other dialects, such as Northern Vlax and Balkan Romani.

The PaLM 2 language model was a key element, helping Translate learn languages that are closely related to each other more efficiently. Examples include languages close to Hindi, such as Awadhi and Marwadi, and French-based creoles, such as Seychellois Creole and Mauritian Creole. As technology advances and the company continues to collaborate with expert linguists and native speakers, it intends to support even more language varieties and spelling standards.

You can find a list of new languages used by Google Translate here.
