大規模言語モデル

2025-03-13 10:26:30 | 英語特許散策

US2022310080(GOOGLE LLC [US])
When the utterance-level confidence score 350 fails to satisfy (e.g., is less than) the confidence threshold (e.g., decision block 450 is “No”),
【００７０】
発声レベルコンフィデンススコア350がコンフィデンス閾値を満足していない場合(例えばコンフィデンス閾値未満である場合)(例えば決定ブロック450が「ノー」である)、

then the confidence-based routine rejects the transcription 204 generated by the first speech recognizer 200 and passes the audio data 202 to the second speech recognizer 402 for processing to re-transcribe the utterance 12 .
コンフィデンスベースのルーチンは、第1の音声認識装置200によって生成された表現形式204を拒否し、発声12を再転記する処理のために音響データ202を第2の音声認識装置402に引き渡す。

The transcription 204 generated by the second speech recognizer 402 may be passed back to the user device 110 and/or to the downstream NLU module for interpretation.
第2の音声認識装置402によって生成された表現形式204はユーザデバイス110に戻すことができ、および/または翻訳のために下流側のNLUモジュールに戻すことができる。

In examples where the first speech recognizer 200 is local and executing on-device 110 and the second speech recognizer 402 is server-side and executing on a remote server 410 ,
第1の音声認識装置200が局所で、かつ、デバイス110上で実行し、また、第2の音声認識装置402がサーバ側で、かつ、遠隔サーバ410上で実行する例では、

the confidence-based routine causes the user device 110 to transmit the audio data 202 to the remote server 410 via a network (not shown) so that the second speech recognizer 402 executing thereon can transcribe the utterance 12 .
コンフィデンスベースのルーチンは、遠隔サーバ410上で実行する第2の音声認識装置402が発声12を転記することができるように、ユーザデバイス110に、ネットワーク(図示せず)を介して音響データ202を遠隔サーバ410に送信させる。

The second speech recognizer 402 may leverage a large language model trained on large-scale language model training data (*Note: doesn't "large-scale" modify "data" here?) making the second speech recognizer 402 more suitable for recognizing proper nouns or less-common words not present in the training data used to train the first speech recognizer 200 .
第2の音声認識装置402は、大規模言語モデル訓練データ上で訓練された大型言語モデルを利用して、第2の音声認識装置402を適切な名詞の認識により適したものにすることができ、あるいは第2の音声認識装置402を、第1の音声認識装置200を訓練するために使用される訓練データには存在していない共通ワードがより少ないものにすることができる。
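The confidence-based routine quoted above can be sketched roughly as follows. This is only an illustration of the described control flow, not the patent's implementation: `Recognition`, `transcribe_with_fallback`, the stub recognizers, and the threshold value 0.8 are all hypothetical names and numbers chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Recognition:
    """Hypothetical container for a transcription and its utterance-level confidence."""
    text: str
    confidence: float  # assumed to lie in [0, 1]


# A recognizer is modeled here as any callable from audio bytes to a Recognition.
Recognizer = Callable[[bytes], Recognition]


def transcribe_with_fallback(audio: bytes,
                             first: Recognizer,
                             second: Recognizer,
                             threshold: float = 0.8) -> str:
    """Keep the first recognizer's transcription if its confidence satisfies
    the threshold; otherwise reject it and re-transcribe with the second."""
    result = first(audio)
    if result.confidence >= threshold:
        return result.text
    # Confidence is below the threshold: pass the same audio to the second
    # recognizer (in the patent's example, one running on a remote server).
    return second(audio).text


# Stub recognizers standing in for the on-device and server-side models.
def on_device(audio: bytes) -> Recognition:
    return Recognition("call jon", 0.42)


def server_side(audio: bytes) -> Recognition:
    return Recognition("call John", 0.95)


print(transcribe_with_fallback(b"\x00\x01", on_device, server_side))  # prints "call John"
```

In a real deployment the second call would involve sending the audio over a network, which the sketch leaves out.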

US9619465(GOOGLE INC [US])
[0053] The translation quality of a statistical machine translation (SMT) system can generally be improved by increasing the size of either or both of the translation model (TM) and the language model (LM) of the system.
【００３８】
統計的機械翻訳（ＳＭＴ）システムの翻訳品質は、一般的に、システムの翻訳モデル（ＴＭ）及び言語モデル（ＬＭ）のいずれか又は双方のサイズを大きくすることによって、改善し得る。

Hence, the system 200 may have large translation and language models that need partition in practical implementations in part due to the limited storage capacity in a single machine.
従って、システム２００は、部分的には、単一の機械における限定された記憶容量のために、実用的な実施例では、区画を必要とする大きな翻訳及び言語モデルを有し得る。

As an example, large language models for English can be derived from about 200 billion words to 8 trillion words and are from about 1 Terabyte to 4 Terabytes in size.
一例として、英語用の大規模言語モデルは、約２千億語乃至８兆語から導出することができ、また、サイズが、約１テラバイト乃至４テラバイトである。

A large TM may be on the order of magnitude of 200 million words or larger. As more documents are made available on line, the LM may increase further in size.
大きなＴＭは、２億語以上の大きさのオーダであってよい。より多くの文書がオンラインで利用可能になるにつれて、ＬＭは、サイズが、更に増大し得る。

Hence, partition provides an effective approach to high-quality MT systems using the distributed machine processing.
従って、区画は、分散型機械処理を用いて、高品質ＭＴシステムへの効果的な手法を提供する。

Replication and load balancing can also be used in such DMT systems and other MT systems based on large language and translation models.
複製及び負荷分散は、大きな言語及び翻訳モデルに基づき、そのようなＤＭＴシステム及び他のＭＴシステムにも用い得る。
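To make "partition", "replication", and "load balancing" in the passage above more concrete, here is a toy sketch. It assumes hash-based sharding of n-gram counts and round-robin replica selection; the excerpt does not say how the patent actually partitions its models, and `partition_of` and `PartitionedLM` are hypothetical names. In-memory dictionaries stand in for separate machines.

```python
import hashlib
import itertools


def partition_of(ngram: str, num_partitions: int) -> int:
    """Map an n-gram to a partition index with a stable hash.
    (Python's built-in hash() is randomized per process, so it is avoided.)"""
    digest = hashlib.md5(ngram.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


class PartitionedLM:
    """Toy language-model store: n-gram counts are split across partitions,
    and every partition is copied onto several replicas."""

    def __init__(self, counts: dict, num_partitions: int, num_replicas: int):
        self.num_partitions = num_partitions
        shards = [dict() for _ in range(num_partitions)]
        for ngram, count in counts.items():
            shards[partition_of(ngram, num_partitions)][ngram] = count
        # Replication: each shard is duplicated num_replicas times.
        self.replicas = [[dict(shard) for _ in range(num_replicas)]
                         for shard in shards]
        # Load balancing: a round-robin cursor per partition.
        self._next = [itertools.cycle(range(num_replicas))
                      for _ in range(num_partitions)]

    def lookup(self, ngram: str):
        """Return (count, (partition, replica)) for the given n-gram."""
        p = partition_of(ngram, self.num_partitions)
        r = next(self._next[p])
        return self.replicas[p][r].get(ngram, 0), (p, r)


lm = PartitionedLM({"the cat": 3, "cat sat": 1}, num_partitions=4, num_replicas=2)
count, (partition, replica) = lm.lookup("the cat")
print(count, replica)  # prints "3 0"; a second lookup would use replica 1
```

Each lookup touches only one partition, which is why a model too large for a single machine can still be served; the replicas spread repeated lookups of the same partition across several copies.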

What is 大規模言語モデル?

ChatGPT:

大規模言語モデル (LLM, Large Language Model) refers to a type of artificial intelligence model that is trained on vast amounts of text data to understand and generate human language. These models, such as GPT (like me), are based on deep learning techniques, particularly transformers, and are capable of tasks like text generation, translation, summarization, and answering questions.


和英特許翻訳メモ

Handy expressions, questions, puzzles, and other notes, written as they occur to me. The examples collected here are not necessarily typical or model usage.
