Blog posts for April 2022 - dak blog

How to get the URL of the currently running script in JavaScript

2022-04-27 00:09:57 | JavaScript

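A note on how to get the URL of the currently running script in JavaScript.
document.currentScript returns the script element that is currently being executed,
and document.currentScript.src returns the URL of that JavaScript file.
Note that document.currentScript is null inside module scripts and inside callbacks or event handlers, so read it while the script is first being evaluated.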
■HTML

<html>
<head>
<title>script</title>
</head>
<body>
<script type="text/javascript" src="script1.js"></script>
</body>
</html>

■JavaScript (script1.js)

const script_url = document.currentScript.src;
console.log('script url: ' + script_url);

const url_obj = new URL(script_url);
console.log('origin: ' + url_obj.origin);

■Output

script url: http://localhost/script/script1.js
origin: http://localhost

Calling Python functions from node.js with pythonia

2022-04-16 15:19:27 | Node.js

With pythonia, you can call Python functions from node.js (TypeScript).

■Python program (list.py)

def list_sum(lst):
    s = 0
    for val in lst:
        s += val
    return s

def list_mul(lst):
    m = 0
    for i, val in enumerate(lst):
        if i == 0:
            m = val
        else:
            m *= val
    return m

■TypeScript program

import process from 'process';
import { python } from 'pythonia';

(async () => {
  const list = await python('./list.py'); // the leading ./ is required
  const lst = [1, 2, 4, 6, 8];

  const s = await list.list_sum(lst);
  console.log(s);

  const m = await list.list_mul(lst);
  console.log(m);

  process.exit(0);
})();

■Output
The sum and the product are printed.

21
384

However, the following error message is also printed.

Exception ignored in: <function Proxy.__del__ at 0x7f404f6e7f70>
...
SystemExit: 1
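
As a quick cross-check of the values above, the two functions in list.py can be run directly in Python and compared with the built-ins sum() and math.prod() (Python 3.8 or later). This is only a sanity-check sketch; the comparison with the built-ins is an addition and not part of the original program. One difference worth noting is the empty list: list_mul returns 0, while math.prod returns 1.

```python
import math

# Same definitions as list.py above.
def list_sum(lst):
    s = 0
    for val in lst:
        s += val
    return s

def list_mul(lst):
    m = 0
    for i, val in enumerate(lst):
        if i == 0:
            m = val
        else:
            m *= val
    return m

lst = [1, 2, 4, 6, 8]
print(list_sum(lst), sum(lst))        # 21 21
print(list_mul(lst), math.prod(lst))  # 384 384
print(list_mul([]), math.prod([]))    # 0 1
```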

Running a Python program from node.js

2022-04-16 12:24:25 | Node.js

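A note on how to run a Python program from node.js (TypeScript) using Python-Shell.

Python-Shell does not call Python methods directly from node.js.
Instead, it runs the Python program in a separate process and exchanges data with it over standard input and output.

For this post I wrote the following two programs.
・Python program
　Reads standard input one line at a time, splits each line on spaces, and computes the sum of the values.
　Example: "1 2 3" => 6

・TypeScript program
　Sends space-separated number strings to the Python program, then receives and prints the results.

■Python program (sum.py)
Flush stdout after every line. Otherwise the output stays in Python's buffer and the node.js side waits for the result forever.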

import sys
import re
import json

def proc_record(res, line):
    # Strip the trailing newline, split on runs of spaces, and sum the integers.
    line = re.sub('[\r\n]+$', '', line)
    items = re.split('[ ]+', line)
    res['result'] = sum([int(item) for item in items])

def main():
    for line in sys.stdin:
        res = {
            'status': 0,
            'message': 'OK',
            'result': None,
        }

        try:
            proc_record(res, line)
        except Exception:
            res['status'] = 1
            res['message'] = 'error'
        finally:
            # Always reply with one JSON line, and flush so node.js receives it immediately.
            print(json.dumps(res, ensure_ascii=False))
            sys.stdout.flush()

    return 0

if __name__ == '__main__':
    res = main()
    sys.exit(res)

■TypeScript program

import { PythonShell } from 'python-shell';

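// Wait for the next line that the Python program writes to stdout.
async function recv(pysh: PythonShell): Promise<any> {
  return new Promise((resolve) => {
    // once() removes the listener after the first message,
    // so repeated calls to recv() do not pile up listeners.
    pysh.once('message', (msg) => {
      resolve(msg);
    });
  });
}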

(async () => {
  const pysh = new PythonShell('sum.py');

  pysh.send('1 2 3');
  const res1 = await recv(pysh);
  console.log(res1);

  pysh.send('1 3 5');
  const res2 = await recv(pysh);
  console.log(res2);

  pysh.end(() => {});
})();

■Output

{"status": 0, "message": "OK", "result": 6}
{"status": 0, "message": "OK", "result": 9}

A multi-process express HTTP server with cluster

2022-04-09 14:15:22 | Node.js

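cluster lets you handle work in multiple processes.
With cluster, the parent process (the master) calls fork() to create child worker processes.

In the program below, the workers handle HTTP requests, and the master creates
a new worker whenever a worker exits, for example after an abnormal termination.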

import os from 'os';
import process from 'process';
import cluster from 'cluster';
import express from 'express';

const port = 8103;
const pid = process.pid;
console.log(`pid: ${pid}`);

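if (cluster.isMaster) { // since Node.js 16, cluster.isPrimary is the preferred name (isMaster is deprecated)
  console.log(`master: ${pid}`);
  const num_cpus = os.cpus().length;
  for (let i = 0; i < num_cpus * 3; i++) { // this machine has a single CPU, so fork 3 workers per CPU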
    cluster.fork();
  }

  cluster.on('exit', (worker, code, signal) => {
    const wpid = worker.process.pid;
    console.log(`exit: ${wpid}`);
    cluster.fork();
  });
}
else {
  const app = express();

  app.get('/', (http_req, http_res) => {
    console.log(`receive: ${pid}`);
    http_res.send(`worker_pid: ${pid}\n`);
  });

  app.listen(port, () => {
    console.log(`http server start: ${pid}`);
  });
}

■Example run
The process ID is printed first, so the master's process ID appears.

pid: 9394
master: 9394

Next, the worker process IDs are printed, and then each worker starts its HTTP server.

pid: 9401
pid: 9402
pid: 9403
http server start: 9401
http server start: 9402
http server start: 9403

■Sending HTTP requests
When several HTTP requests are sent, each response contains the process ID of the worker that handled the request.

curl 'http://localhost:8103/'
-->
worker_pid: 9403

curl 'http://localhost:8103/'
-->
worker_pid: 9402

curl 'http://localhost:8103/'
-->
worker_pid: 9401

■Killing a worker
Kill the worker with pid=9401.

kill -9 9401

The master then creates a new worker.

exit: 9401
pid: 9430
http server start: 9430

When an HTTP request is sent, the new worker also returns responses.

curl 'http://localhost:8103/'
-->
worker_pid: 9430

■Killing the master
When the master is killed, the workers exit as well.

kill -9 9394
ps
    PID TTY          TIME CMD
   5923 pts/1    00:00:00 bash
   9565 pts/1    00:00:00 ps
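
The supervision part of this setup (start several workers, then start a replacement whenever one exits) can be sketched in Python with subprocess. This is only an illustration of the respawn pattern, not of Node's cluster module: there is no shared listening port here, and WORKER_CODE, fork_worker and respawn_exited are hypothetical names introduced for the sketch.

```python
import subprocess
import sys

# Hypothetical worker: it blocks on stdin until it is killed or stdin is closed.
WORKER_CODE = "import sys; sys.stdin.read()"

def fork_worker():
    """Start one worker process."""
    return subprocess.Popen([sys.executable, "-c", WORKER_CODE], stdin=subprocess.PIPE)

def respawn_exited(workers):
    """Return a new worker list in which every exited worker is replaced."""
    return [w if w.poll() is None else fork_worker() for w in workers]

workers = [fork_worker() for _ in range(3)]
print("workers:", [w.pid for w in workers])

# Simulate `kill -9` on the first worker, then let the supervisor react.
killed = workers[0]
killed.kill()
killed.wait()
killed.stdin.close()
print("exit:", killed.pid)

workers = respawn_exited(workers)
print("workers:", [w.pid for w in workers])

# Clean up: closing stdin gives each worker EOF, so it finishes normally.
for w in workers:
    w.stdin.close()
    w.wait()
```

A real supervisor would run respawn_exited in a loop or react to an exit notification, as cluster.on('exit', ...) does in the program above.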

Morphological analysis with the node.js version of kuromoji

2022-04-08 23:37:06 | Node.js

I tried morphological analysis with the node.js version of kuromoji.
This time kuromoji is used from TypeScript.

Install kuromoji as follows.

npm install kuromoji
npm install @types/kuromoji

To make kuromoji easier to use, I wrapped it in a small library as follows.
■KuromojiUtil.ts

/**
 * Kuromoji Util
 */

import kuromoji, { Tokenizer, IpadicFeatures, TokenizerBuilder } from 'kuromoji';

export default class KuromojiUtil {
  private constructor() {}

  public static getTokenizer(paramh: any): Promise<Tokenizer<IpadicFeatures>> {
    const builder: TokenizerBuilder<IpadicFeatures> =
      kuromoji.builder(paramh);

    return new Promise<Tokenizer<IpadicFeatures>>((done, fail) => {
      builder.build((err, tknzr) => {
        if (err) {
          // e.g. the dictionary could not be loaded from dicPath
          fail(err);
          return;
        }
        done(tknzr);
      });
    });
  }
}

Morphological analysis is then performed using KuromojiUtil.ts above.

import KuromojiUtil from './KuromojiUtil';

(async () => {
  const paramh = {
    dicPath: 'node_modules/kuromoji/dict',
  };

  const tknzr = await KuromojiUtil.getTokenizer(paramh);
  const text = '日本語の文です';
  const tkns = tknzr.tokenize(text);
  console.log(tkns);
})();

■Output

[
  {
    word_id: 2591070,
    word_type: 'KNOWN',
    word_position: 1,
    surface_form: '日本語',
    pos: '名詞',
    pos_detail_1: '一般',
    pos_detail_2: '*',
    pos_detail_3: '*',
    conjugated_type: '*',
    conjugated_form: '*',
    basic_form: '日本語',
    reading: 'ニホンゴ',
    pronunciation: 'ニホンゴ'
  },
  {
    word_id: 93100,
    word_type: 'KNOWN',
    word_position: 4,
    surface_form: 'の',
    pos: '助詞',
    pos_detail_1: '連体化',
    pos_detail_2: '*',
    pos_detail_3: '*',
    conjugated_type: '*',
    conjugated_form: '*',
    basic_form: 'の',
    reading: 'ノ',
    pronunciation: 'ノ'
  },
  {
    word_id: 2475380,
    word_type: 'KNOWN',
    word_position: 5,
    surface_form: '文',
    pos: '名詞',
    pos_detail_1: '一般',
    pos_detail_2: '*',
    pos_detail_3: '*',
    conjugated_type: '*',
    conjugated_form: '*',
    basic_form: '文',
    reading: 'ブン',
    pronunciation: 'ブン'
  },
  {
    word_id: 23760,
    word_type: 'KNOWN',
    word_position: 6,
    surface_form: 'です',
    pos: '助動詞',
    pos_detail_1: '*',
    pos_detail_2: '*',
    pos_detail_3: '*',
    conjugated_type: '特殊・デス',
    conjugated_form: '基本形',
    basic_form: 'です',
    reading: 'デス',
    pronunciation: 'デス'
  }
]

Parsing simple regular expressions with chevrotain

2022-04-03 17:51:00 | Node.js

import { CstParser, Lexer, createToken } from 'chevrotain';

// lexer
const STR = createToken({ name: "STR", pattern: /[^()|?]+/ });
const LP = createToken({ name: "LP", pattern: /[(]/ });
const RP = createToken({ name: "RP", pattern: /[)]/ });
const PIPE = createToken({ name: "PIPE", pattern: /[|]/ });
const QM = createToken({ name: "QM", pattern: /[?]/ });

const allTokens = [
  STR,
  LP,
  RP,
  PIPE,
  QM,
];

const lexer = new Lexer(allTokens, { positionTracking: "onlyOffset" });
  

// parser
class MatchOrParser extends CstParser {
  public value_stack: any[] = [];
  
  constructor() {
    super(allTokens);
    this.performSelfAnalysis();
  }
  
  public root = this.RULE("root", () => {
    this.SUBRULE1(this.exprs);
  });
  
  public exprs = this.RULE("exprs", () => {
    this.SUBRULE1(this.expr);
    this.MANY(() => {
      this.SUBRULE2(this.expr);
    });
  });
  
  public expr = this.RULE("expr", () => {
    this.SUBRULE(this.or_expr);
  });
  
  public or_expr = this.RULE("or_expr", () => {
    this.OR([
      { ALT: () => {
        this.CONSUME(LP);
        this.SUBRULE1(this.str);
        this.SUBRULE2(this.or_strs);
        this.CONSUME(RP);
        this.OPTION(() => { this.CONSUME(QM); });
      }},
      { ALT: () => {
        this.SUBRULE3(this.str);
      }},
    ]);
  });
  
  public or_strs = this.RULE("or_strs", () => {
    this.MANY(() => {
      this.CONSUME(PIPE);
      this.SUBRULE(this.str);
    });
  });
  
  public str = this.RULE("str", () => {
    this.CONSUME(STR);
  });
}


const parser = new MatchOrParser();
const BaseCstVisitor = parser.getBaseCstVisitorConstructor();


class MatchOrVisitor extends BaseCstVisitor {
  public constructor() {
    super();
    this.validateVisitor();
  }

  root(ctx: any) {
    const v = this.visit(ctx.exprs);
    return {
      type: "root",
      exprs: v.exprs,
    };
  }
  
  exprs(ctx: any) {
    const ret = {
      type: "exprs",
      exprs: [],
    };
    
    for (let e of ctx.expr) {
      let v: any = this.visit(e);
      //console.log(JSON.stringify(v));
      ret.exprs.push(v as never);
    }

    return ret;
  }
  
  expr(ctx: any) {
    if (ctx.or_expr) {
      const v1 = this.visit(ctx.or_expr);
      return v1;
    }
    else {
      const v2 = this.visit(ctx.str);
      return v2;
    }
  }
  
  or_expr(ctx: any) {
    if (ctx.or_strs) {
      const v1 = this.visit(ctx.str);
      const v2 = this.visit(ctx.or_strs);
      return {
        type: "or_expr",
        exprs: [v1.str].concat(v2.strs),
        qm: ctx.QM ? true : false,
      };
    }
    else {
      const v = this.visit(ctx.str);
      return {
        type: "str",
        str: v.str,
      };
    }
  }

  or_strs(ctx: any) {
    const ret = {
      type: "or_strs",
      strs: [],
    };

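    for (let e of ctx.str) {
      // Visit each alternative in turn. Passing the whole ctx.str array
      // to visit() would only ever visit its first element.
      let v = this.visit(e);
      ret.strs.push(v.str as never);
    }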

    return ret;
  }

  str(ctx: any) {
    return {
      type: "str",
      str: ctx.STR[0].image,
    };
  }
}


const visitor = new MatchOrVisitor();

const texts = [
  'あ',
  '(あ|ア)',
  '(あ|ア)?',
  '(あい|アイ)(う|ウ)?(えお|エオ)',
];

for (let text of texts) {
  console.log(text);
  
  let lex_result = lexer.tokenize(text);
  parser.input = lex_result.tokens;
  let cst = parser.root();
  //console.log(JSON.stringify(cst));
  let res = visitor.visit(cst);
  console.log(JSON.stringify(res));
}

■Output

あ
{"type":"root","exprs":[{"type":"str","str":"あ"}]}

(あ|ア)
{"type":"root","exprs":[{"type":"or_expr","exprs":["あ","ア"],"qm":false}]}

(あ|ア)?
{"type":"root","exprs":[{"type":"or_expr","exprs":["あ","ア"],"qm":true}]}

(あい|アイ)(う|ウ)?(えお|エオ)
{"type":"root","exprs":[{"type":"or_expr","exprs":["あい","アイ"],"qm":false},{"type":"or_expr","exprs":["う","ウ"],"qm":true},{"type":"or_expr","exprs":["えお","エオ"],"qm":false}]}

Evaluating a chevrotain parse result for addition and subtraction with a visitor

2022-04-03 13:51:47 | Node.js

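A note on how to evaluate a chevrotain parse result for addition and subtraction with a visitor.

Here we build a parser for the following addition and subtraction grammar.
calc -> expr
expr -> val ('+' | '-') val
val -> [0-9]+

The lexer performs lexical analysis and the parser performs syntactic analysis.
The visitor then evaluates the parse result and returns the computed value.

The program is as follows.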

import { CstParser, Lexer, createToken } from 'chevrotain';

/**
 * lexer
 */
const Num = createToken({ name: "Num", pattern: /[0-9]+/ });
const Plus = createToken({ name: "Plus", pattern: /[+]/ });
const Minus = createToken({ name: "Minus", pattern: /[-]/ });

const allTokens = [
  Num,
  Plus,
  Minus,
];

const calcLexer = new Lexer(allTokens, { positionTracking: "onlyOffset" });

/**
 * parser
 */
class CalcParser extends CstParser {
  public value_stack: any[] = [];
  
  constructor() {
    super(allTokens);
    this.performSelfAnalysis();
  }
  
  public calc = this.RULE("calc", () => {
    this.SUBRULE(this.expr);
  });
 
  public expr = this.RULE("expr", () => {
    this.SUBRULE1(this.val);
    this.OR([
      { ALT: () => { this.CONSUME1(Plus); }},
      { ALT: () => { this.CONSUME2(Minus); }},
    ]);
    this.SUBRULE2(this.val);
  });

  private val = this.RULE("val", () => {
    this.CONSUME(Num);
  });
}

const parser = new CalcParser();
const BaseCstVisitor = parser.getBaseCstVisitorConstructor();

/**
 * visitor
 */
class CalcVisitor extends BaseCstVisitor {
  public constructor() {
    super();
    this.validateVisitor();
  }

  public calc(ctx: any) {
    const v = this.visit(ctx.expr);
    return {
      type: "calc",
      value: v.value,
    };
  }
  
  public expr(ctx: any) {
    const v_val1 = this.visit(ctx.val[0]);
    const v_val2 = this.visit(ctx.val[1]);
    if (ctx.Plus) {
      return {
        type: "expr",
        value: v_val1.value + v_val2.value,
      };
    }
    else {
      return {
        type: "expr",
        value: v_val1.value - v_val2.value,
      };
    }
  }

  public val(ctx: any) {
    const value = parseInt(ctx.Num[0].image);
    return {
      type: "val",
      value: value,
    };
  }
}

const calc_visitor = new CalcVisitor();

const texts = [
  '1+2',
  '7-3',
  '3*5', // not covered by the grammar
];

for (let text of texts) {
  console.log(text);
  
  let lex_result = calcLexer.tokenize(text);
  parser.input = lex_result.tokens;
  let cst = parser.calc();
  let res = calc_visitor.visit(cst);
  console.log(res?.value); // res is undefined when parsing fails
}

■Output

1+2
3
7-3
4
3*5
undefined

For input outside the grammar, undefined is returned.
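
For comparison, the same three-rule grammar is small enough to evaluate by hand. The following is a minimal Python sketch, not a port of chevrotain: tokenize and calc are names introduced here, and unlike the chevrotain lexer, this tokenizer rejects the whole input as soon as it meets an unknown character. It returns None where the program above returns undefined.

```python
import re

# One named group per token type, mirroring Num / Plus / Minus above.
TOKEN_RE = re.compile(r"(?P<Num>[0-9]+)|(?P<Plus>[+])|(?P<Minus>[-])")

def tokenize(text):
    """Return a list of (kind, image) pairs, or None on an unknown character."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            return None
        tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens

def calc(text):
    """Evaluate val ('+' | '-') val; return None for input outside the grammar."""
    tokens = tokenize(text)
    if tokens is None or len(tokens) != 3:
        return None
    (k1, v1), (op, _), (k2, v2) = tokens
    if k1 != "Num" or k2 != "Num" or op not in ("Plus", "Minus"):
        return None
    left, right = int(v1), int(v2)
    return left + right if op == "Plus" else left - right

for text in ["1+2", "7-3", "3*5"]:
    print(text, calc(text))
# 1+2 3
# 7-3 4
# 3*5 None
```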

Detecting hiragana, katakana, and kanji with regular expressions in Python

2022-04-01 14:25:15 | python

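A note on how to detect hiragana, katakana, and kanji with regular expressions in Python.

Hiragana and katakana can be checked with code point ranges.
　Hiragana: U+3040 - U+309F
　Katakana: U+30A0 - U+30FF

In addition, the regex module lets you specify Script=Hiragana/Katakana/Han
to detect hiragana, katakana, and kanji.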

import re
import regex

# Hiragana: U+3040 - U+309F
text = 'あいうえお'
res = re.match('^[\u3040-\u309F]+$', text)
print(res)
-->
<re.Match object; span=(0, 5), match='あいうえお'>

res = regex.match(r'^\p{Script=Hiragana}+$', text)
print(res)
-->
<regex.Match object; span=(0, 5), match='あいうえお'>

# Katakana: U+30A0 - U+30FF
text = 'アイウエオ'
res = re.match('^[\u30A0-\u30FF]+$', text)
print(res)
-->
<re.Match object; span=(0, 5), match='アイウエオ'>

res = regex.match(r'^\p{Script=Katakana}+$', text)
print(res)
-->
<regex.Match object; span=(0, 5), match='アイウエオ'>

# Kanji
text = '漢字'
res = regex.match(r'^\p{Script=Han}+$', text)
print(res)
-->
<regex.Match object; span=(0, 2), match='漢字'>

# Hiragana + katakana
text = 'ひらがなカタカナ'
res = re.match('^[\u3040-\u309F\u30A0-\u30FF]+$', text)
print(res)
-->
<re.Match object; span=(0, 8), match='ひらがなカタカナ'>

# Hiragana + katakana + kanji
text = 'ひらがなカタカナ漢字'
res = regex.match(r'^(?:\p{Script=Hiragana}|\p{Script=Katakana}|\p{Script=Han})+$', text)
print(res)
-->
<regex.Match object; span=(0, 10), match='ひらがなカタカナ漢字'>
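
The code point checks can be wrapped into small predicates that need only the standard re module. This is a minimal sketch; is_hiragana, is_katakana and is_kana are helper names introduced here for illustration. Because it tests the whole Unicode block, the katakana range also accepts the symbols in that block, such as the prolonged sound mark ー (U+30FC) and the middle dot ・ (U+30FB).

```python
import re

# Block ranges from the post: Hiragana U+3040-U+309F, Katakana U+30A0-U+30FF.
HIRAGANA_RE = re.compile(r"[\u3040-\u309F]+")
KATAKANA_RE = re.compile(r"[\u30A0-\u30FF]+")
KANA_RE = re.compile(r"[\u3040-\u30FF]+")

def is_hiragana(text):
    """True if text is non-empty and every character is in the Hiragana block."""
    return HIRAGANA_RE.fullmatch(text) is not None

def is_katakana(text):
    """True if text is non-empty and every character is in the Katakana block."""
    return KATAKANA_RE.fullmatch(text) is not None

def is_kana(text):
    """True if text is non-empty and consists only of hiragana and katakana."""
    return KANA_RE.fullmatch(text) is not None

print(is_hiragana("あいうえお"))      # True
print(is_hiragana("あいうエオ"))      # False
print(is_katakana("コーヒー"))        # True
print(is_kana("ひらがなカタカナ"))    # True
print(is_kana("漢字"))                # False
```

fullmatch() is used instead of match() with ^ and $, so a trailing newline in the input is not silently accepted.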

Searching descendants with xpath in lxml

2022-04-01 13:48:43 | python

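A note on how to search only the descendants of a given node with lxml.

If you call node.xpath("//tag") on a node, every node in the document is searched,
even when node is not the root node.
To search only the descendants, use node.xpath(".//tag").

An example run follows.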

import sys
import lxml.html

htmlstr = """
<html>
<body>
  <div id="d1">
    <div id="d2-1">
      <div id="d3-1">
        <p id="p4-1">p4-1 text</p>
        <p id="p4-2">p4-2 text</p>
      </div>
      <div id="d3-2"></div>
    </div>
    <div id="d2-2">
      <p id="p3-3">p3-3 text</p>
      <p id="p3-4">p3-4 text</p>
    </div>
  </div>
</body>
</html>
"""

dom = lxml.html.fromstring(htmlstr)

■Find a node by id

node = dom.xpath("//div[@id='d3-1']")[0]
print(node.tag)
print(node.attrib)
-->
div
{'id': 'd3-1'}

■Searching with //p also returns nodes that are not descendants

ps = node.xpath("//p")
print(len(ps))
for p in ps:
    print(p.attrib)
-->
4
{'id': 'p4-1'}
{'id': 'p4-2'}
{'id': 'p3-3'}
{'id': 'p3-4'}

■Searching with .//p returns only descendant nodes

ps = node.xpath(".//p")
print(len(ps))
for p in ps:
    print(p.attrib)
    print(p.text)
-->
2
{'id': 'p4-1'}
p4-1 text
{'id': 'p4-2'}
p4-2 text
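
When lxml is not installed, the descendant-only search can also be tried with the standard library's xml.etree.ElementTree. This is a swapped-in technique, not lxml: ElementTree supports only a subset of XPath through find() and findall(), and the input must be well-formed XML. The sketch reuses the document above and searches with the same relative path .//p.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("""
<html>
<body>
  <div id="d1">
    <div id="d2-1">
      <div id="d3-1">
        <p id="p4-1">p4-1 text</p>
        <p id="p4-2">p4-2 text</p>
      </div>
      <div id="d3-2"></div>
    </div>
    <div id="d2-2">
      <p id="p3-3">p3-3 text</p>
      <p id="p3-4">p3-4 text</p>
    </div>
  </div>
</body>
</html>
""")

# Locate the starting node by id, then search relative to it.
node = doc.find(".//div[@id='d3-1']")
descendant_ids = [p.get("id") for p in node.findall(".//p")]
all_ids = [p.get("id") for p in doc.findall(".//p")]

print(descendant_ids)  # ['p4-1', 'p4-2']
print(all_ids)         # ['p4-1', 'p4-2', 'p3-3', 'p3-4']
```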
