Blog posts for September 2024 - dak blog

Uploading a base64-encoded image to Cloud Storage

2024-09-30 23:00:03 | python

Notes on how to upload a base64-encoded image to Cloud Storage.

Program

If content_type is not specified at upload time, the object is treated as text data.

# usage: upload_base64_image_to_gcs.py {bucket} {dir} < {json}

import sys
import base64
import json
from google.cloud import storage

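def upload_to_gcs(bucket, dir, obj):
    # Decode the base64 payload back into raw image bytes.
    img_bytes = base64.b64decode(obj['image_base64'])
    blob = bucket.blob(f"{dir}/{obj['file_name']}")
    # Set content_type explicitly; otherwise the object is stored as text.
    blob.upload_from_string(img_bytes, content_type='image/jpeg')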

def main():
    gcs_bucket = sys.argv[1]
    gcs_dir = sys.argv[2]

    client = storage.Client()
    bucket = client.bucket(gcs_bucket)

    for line in sys.stdin:
        obj = json.loads(line)
        upload_to_gcs(bucket, gcs_dir, obj)

    return 0

if __name__ == '__main__':
    res = main()
    exit(res)

Data

$ cat data.jsonl
{"id":"01","file_name":"01.jpg","image_base64":"/9j/4AAQSkZJ..."}
{"id":"02","file_name":"02.jpg","image_base64":"/9j/4AAQSkZJ..."}

Running the program and checking the result

$ cat data.jsonl | python upload_base64_image_to_gcs.py gcs-bucket-001 images

$ gsutil ls -L gs://gcs-bucket-001/images/*.jpg
gs://gcs-bucket-001/images/01.jpg:
    ...
    Storage class:          STANDARD
    Content-Length:         28300
    Content-Type:           image/jpeg
    ...
gs://gcs-bucket-001/images/02.jpg:
    ...
    Storage class:          STANDARD
    Content-Length:         21648
    Content-Type:           image/jpeg
    ...
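
The program above hard-codes image/jpeg as the content type. As a minimal sketch, assuming the file extension can be trusted, the type could instead be guessed from file_name with the standard mimetypes module. The helper name guess_content_type and the fallback value are illustrative assumptions, not part of the original program.

```python
import mimetypes


def guess_content_type(file_name, default='application/octet-stream'):
    """Guess a MIME type from the file extension (hypothetical helper)."""
    content_type, _encoding = mimetypes.guess_type(file_name)
    # guess_type returns None for the type when the extension is unknown.
    return content_type or default


if __name__ == '__main__':
    for name in ['01.jpg', '02.png', 'no_extension']:
        print(name, guess_content_type(name))
```

The result could then be passed as content_type to blob.upload_from_string() in place of the fixed string.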

Base64-encoding an image in Python

2024-09-29 12:21:34 | python

Notes on how to base64-encode an image in Python.

Program

import sys
import base64
import json


def read_image_file(image_file):
    # Read the whole image file as raw bytes.
    with open(image_file, 'rb') as inst:
        return inst.read()


def main():
    for i in range(1, len(sys.argv)):
        image_file = sys.argv[i]
        image_bytes = read_image_file(image_file)
        # b64encode returns bytes; decode to str so it can go into JSON.
        image_base64 = base64.b64encode(image_bytes).decode('utf-8')
        obj = {
            'id': i,
            'file_name': image_file,
            'image_base64': image_base64,
        }
        print(json.dumps(obj, ensure_ascii=False))

    return 0

if __name__ == '__main__':
    res = main()
    sys.exit(res)

Result

$ python image_base64.py img1.jpg img2.jpg

{"id": 1, "file_name": "img1.jpg", "image_base64": "/9j/4AA..."}
{"id": 2, "file_name": "img2.jpg", "image_base64": "/9j/4AA..."}
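
To check that the encoded lines can be turned back into the original bytes, a round trip can be sketched as follows. The function names encode_record and decode_record are hypothetical, and the sample bytes are only the first few bytes of a JPEG header, not a real image.

```python
import base64
import json


def encode_record(record_id, file_name, image_bytes):
    """Build one JSONL line like the ones printed above."""
    obj = {
        'id': record_id,
        'file_name': file_name,
        'image_base64': base64.b64encode(image_bytes).decode('utf-8'),
    }
    return json.dumps(obj, ensure_ascii=False)


def decode_record(line):
    """Parse a JSONL line and return (file_name, raw image bytes)."""
    obj = json.loads(line)
    return obj['file_name'], base64.b64decode(obj['image_base64'])


if __name__ == '__main__':
    # Stand-in for real image data: the start of a JPEG header.
    sample = b'\xff\xd8\xff\xe0\x00\x10JFIF'
    line = encode_record(1, 'img1.jpg', sample)
    name, restored = decode_record(line)
    print(name, restored == sample)
```

Because JPEG data begins with the bytes ff d8 ff, its base64 form begins with /9j/, which matches the output shown above.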

Getting the date and datetime in Japan time in BigQuery

2024-09-29 11:32:25 | BigQuery

Notes on how to get the date and datetime in Japan time in BigQuery.

To get the date and datetime in Japan time, pass 'Asia/Tokyo' as the argument to current_date() and current_datetime().

SQL

select
  current_date() as utc_date
  , current_date('Asia/Tokyo') as jpn_date
  , current_datetime() as utc_datetime
  , current_datetime('Asia/Tokyo') as jpn_datetime
;

Result

[{
  "utc_date": "2024-09-29",
  "jpn_date": "2024-09-29",
  "utc_datetime": "2024-09-29T02:26:20.153788",
  "jpn_datetime": "2024-09-29T11:26:20.153788"
}]
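
The same UTC-to-Japan-time shift can be reproduced outside BigQuery. The following is a minimal Python sketch that uses a fixed UTC+9 offset, which is sufficient because Japan does not observe daylight saving time. The name to_jst is an illustrative assumption.

```python
from datetime import datetime, timedelta, timezone

# Japan Standard Time is a fixed UTC+9 offset with no daylight saving time.
JST = timezone(timedelta(hours=9), 'JST')


def to_jst(utc_dt):
    """Convert a timezone-aware datetime to Japan time."""
    return utc_dt.astimezone(JST)


if __name__ == '__main__':
    # Late afternoon in UTC is already the next calendar day in Japan.
    utc_dt = datetime(2024, 9, 28, 16, 30, tzinfo=timezone.utc)
    jst_dt = to_jst(utc_dt)
    print(utc_dt.date(), jst_dt.date())
```

In the query result above both dates happen to be the same because it ran at 02:26 UTC; between 15:00 and 24:00 UTC the Japan date is one day ahead.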

Printing a text file from line N onward

2024-09-24 23:18:30 | linux

Notes on how to print a text file from line N onward.

tail command

tail -n +N prints the file from line N onward.

$ cat test.txt
1
2
3
4
5
6
7
8
9
10

$ cat test.txt | tail -n +3
3
4
5
6
7
8
9
10

sed command

sed '1,Nd' deletes the first N lines, so the output starts at line N+1.

$ cat test.txt | sed '1,2d'
3
4
5
6
7
8
9
10

awk command

awk 'NR > N' skips the first N lines.

$ cat test.txt | awk 'NR > 2'
3
4
5
6
7
8
9
10
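
For comparison, the same behavior as tail -n +N can be sketched in Python with itertools.islice. The function name lines_from is hypothetical.

```python
from itertools import islice


def lines_from(lines, n):
    """Return an iterator over lines starting at 1-based line n."""
    # islice counts from 0, so line n is at index n - 1.
    return islice(lines, max(n - 1, 0), None)


if __name__ == '__main__':
    sample = [str(i) for i in range(1, 11)]
    print(list(lines_from(sample, 3)))
```

Since islice works on any iterable, an open file object or sys.stdin could be passed in place of the list.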

Extracting data from JSONL with jq

2024-09-23 23:27:35 | linux

Examples of extracting data from JSONL with jq.

$ cat test.jsonl
{"abc": "abc", "def": "def"}
{"abc": "ghi", "def": "jkl"}

Outputting specified fields

$ cat test.jsonl | jq -c '{abc: .abc}'
{"abc":"abc"}
{"abc":"ghi"}

Without -c, the output looks like this.

{
  "abc": "abc"
}
{
  "abc": "ghi"
}
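
When jq is not available, the same field selection can be approximated with the standard json module. This is a rough sketch; pick_fields is a hypothetical name, and missing keys are emitted as null, which is also what jq does for {abc: .abc}.

```python
import json


def pick_fields(line, keys):
    """Keep only the given keys of one JSONL record, in compact form."""
    obj = json.loads(line)
    picked = {key: obj.get(key) for key in keys}
    # Separators without spaces mimic the compact output of jq -c.
    return json.dumps(picked, ensure_ascii=False, separators=(',', ':'))


if __name__ == '__main__':
    lines = [
        '{"abc": "abc", "def": "def"}',
        '{"abc": "ghi", "def": "jkl"}',
    ]
    for line in lines:
        print(pick_fields(line, ['abc']))
```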

Vertex AI multimodal embeddings

2024-09-23 23:03:35 | BigQuery

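Notes on a sample run of Vertex AI multimodal embeddings.

Image embedding

Overview

Create an external connection
Set permissions
Upload image files to Cloud Storage
Create an object table

Creating the external connection

Select "BigQuery" from the GCP menu.
Click "+ Add".
Select "Connections to external data sources".
For the connection type, select "Vertex AI remote models".
For "Region", select "asia-northeast1".
Click "Create connection".

Setting permissions

Grant the following roles to the external connection in IAM.

Select "IAM & Admin" from the GCP menu.
Click "Grant access".
In "New principals", specify the user created for the external data connection, and grant the following roles.
- Storage Object Admin
- BigQuery Connection Admin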

Uploading image files to Cloud Storage

$ gsutil cp *.jpg gs://{bucket}/{path}/

Creating the object table

Run the following.

CREATE OR REPLACE EXTERNAL TABLE
  `{project}.{dataset}.{table}`
WITH CONNECTION
  `{region}.{connection_id}`
OPTIONS (
  object_metadata = 'SIMPLE',
  uris = ['gs://{bucket}/{path}/*.jpg']
);

Embedding the images

select
  regexp_extract(uri, '/([0-9]+).jpg$') as id
  , ml_generate_embedding_result as image_vector
from
  ml.generate_embedding(
    model {dataset}.{model},
    table {dataset}.{object_table},
    struct(
      true as flatten_json_output,
      512 as output_dimensionality
    )
  )
;
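
To see what the regexp_extract pattern in the query pulls out of each object URI, the same pattern can be tried in Python. This is only an illustration; the URIs below are made-up examples, and extract_id is a hypothetical name.

```python
import re

# Same pattern as in the query. The "." before "jpg" is unescaped, so it
# matches any character; r'/([0-9]+)\.jpg$' would be stricter.
ID_PATTERN = re.compile(r'/([0-9]+).jpg$')


def extract_id(uri):
    """Return the numeric file name part of a URI, or None if no match."""
    match = ID_PATTERN.search(uri)
    return match.group(1) if match else None


if __name__ == '__main__':
    for uri in ['gs://bucket/images/01.jpg', 'gs://bucket/images/cat.jpg']:
        print(uri, extract_id(uri))
```

Files whose names are not purely numeric produce no id, so the query assumes images named like 01.jpg.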

Moving a running process to the background

2024-09-23 22:19:06 | linux

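Notes on how to move a running process into the background.

Suppose the following command is currently running.

$ cat test.txt | python test.py > result.txt

If you close the terminal at this point, the process is terminated as well.

To avoid that, move it to the background as follows.

Pausing the command

Press ctrl + z to suspend the process.

Resuming in the background

$ bg

Detaching the process from the shell session

$ disown -h %1

The process now keeps running in the background even after the terminal is closed.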
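
The steps above rescue a command that is already running. A different technique, shown here only as a sketch, is to start the command detached in the first place from Python. The function name launch_detached is hypothetical, and start_new_session is effective on POSIX systems only.

```python
import subprocess
import sys


def launch_detached(args):
    """Start a command in its own session, detached from the terminal."""
    return subprocess.Popen(
        args,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,  # POSIX only: the child calls setsid()
    )


if __name__ == '__main__':
    proc = launch_detached([sys.executable, '-c', 'import time; time.sleep(1)'])
    print(proc.pid)
```

Because the child runs in a new session, it typically does not receive the hangup signal sent when the launching terminal closes. Output is discarded here, so a real job would redirect stdout to a file instead.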

Notes on how to receive an uploaded image in Python

2024-09-22 20:57:11 | python

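An image is chosen by drag and drop and then sent to the server.

The server is written in Python with Flask.

Note: the blog interpreted the template files below as HTML, so parts of their markup are missing.

File layout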

.
├── image_controller.py
├── image_processor.py
├── image_server.py
└── template
    ├── index.html
    └── upload.html

Server: image_server.py

from flask import Flask, render_template
from image_controller import ImageController

gconfig = {
    'port': 5001,
    'template': {
        'folder': 'template',
    },
    'index': {
        'url': '/intra/test/upload_image/',
        'template': 'index.html',
    },
    'upload': {
        'url': '/intra/test/upload_image/upload',
        'template': 'upload.html',
    }
}

app = Flask(__name__, template_folder=gconfig['template']['folder'])

@app.route(gconfig['index']['url'], methods=["GET"])
def view_index():
    html = render_template(gconfig['index']['template'])
    return html

@app.route(gconfig['upload']['url'], methods=["POST"])
def view_upload():
    ctrl = ImageController(gconfig)
    resp_obj = ctrl.upload()
    html = render_template(
        gconfig['upload']['template'],
        title=resp_obj['title'],
        image_type=resp_obj['image_type'],
        image_base64=resp_obj['image_base64']
    )
    return html

def main():
    app.run(port=gconfig['port'], host='0.0.0.0')
    return 1

if __name__ == '__main__':
    res = main()
    exit(res)

Controller: image_controller.py

from flask import request
from image_processor import ImageProcessor

class ImageController:
    def __init__(self, config):
        self.config = config
        self.processor = ImageProcessor(config)

    def upload(self):
        params = {}
        params['title'] = request.form.get('title', '')
        params['image'] = request.files.get('image').stream.read()
        #print(params['image'])
        resp_obj = self.processor.upload(params)
        return resp_obj

Processing: image_processor.py

import io
import base64
import imghdr

class ImageProcessor:
    def __init__(self, config):
        self.config = config

    def upload(self, params):
        image_base64 = base64.b64encode(params['image']).decode('utf-8')
        image_type = imghdr.what(io.BytesIO(params['image']))

        obj = {
            'title': params['title'],
            'image_type': f'image/{image_type}',
            'image_base64': image_base64,
        }

        return obj
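
The processor returns image_type and image_base64 as separate values. One plausible way to display them in a page, assumed here rather than confirmed by the source, is to join them into a data URI for an img element's src attribute. The helper name build_data_uri is hypothetical.

```python
import base64


def build_data_uri(image_type, image_bytes):
    """Build a data URI that an <img src="..."> attribute can display."""
    image_base64 = base64.b64encode(image_bytes).decode('utf-8')
    return f'data:{image_type};base64,{image_base64}'


if __name__ == '__main__':
    # Stand-in bytes; a real call would pass the uploaded image data.
    print(build_data_uri('image/jpeg', b'\xff\xd8\xff\xe0'))
```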

Image upload page template: index.html

<!DOCTYPE html>
<html lang="ja">
<head>


<title>drag & drop image</title>
</head>
<body>
<form method="POST" action="upload" enctype="multipart/form-data">
<input id="id_input" type="file" name="image" accept="image/*" hidden>
<input id="id_title" type="text" name="title" size="40">





<button id="id_remove" type="button">Remove</button>


<button id="id_submit" type="submit">Submit</button>
</form>

<script>
  const uploadObj = document.getElementById("id_upload");
  const previewObj = document.getElementById("id_preview");
  const noImageObj = document.getElementById("id_no_image");
  const imageInputObj = document.getElementById("id_input");
  const removeObj = document.getElementById("id_remove");
  const submitObj = document.getElementById("id_submit");

  // Clicking the drop area opens the file picker
  uploadObj.addEventListener("click", () => imageInputObj.click());

  // Show the preview
  imageInputObj.addEventListener("change", (e) => {
    const file = e.target.files[0];
    if (!file) return;
    previewObj.removeAttribute("hidden");
    previewObj.src = URL.createObjectURL(file);
  }, false);

  // Highlight the drop area while a file is dragged over it
  uploadObj.addEventListener("dragover", (e) => {
    e.stopPropagation();
    e.preventDefault();
    uploadObj.style.background = "#cccccc";
  });

  // Restore the background when the drag leaves the drop area
  uploadObj.addEventListener("dragleave", (e) => {
    e.stopPropagation();
    e.preventDefault();
    uploadObj.style.background = "#ffffff";
  });

  // Drop
  uploadObj.addEventListener("drop", (e) => {
    e.stopPropagation();
    e.preventDefault();
    uploadObj.style.background = "#ffffff";
    const files = e.dataTransfer.files;
    // Accept exactly one file
    if (files.length !== 1) return;
    imageInputObj.files = files;
    previewObj.removeAttribute("hidden");
    previewObj.src = URL.createObjectURL(imageInputObj.files[0]);
  });

  // Remove
  removeObj.addEventListener("click", (e) => {
    previewObj.setAttribute("hidden", true);
    previewObj.src = "";
  });
</script>
</body>
</html>

Server response template: upload.html

<html>
<head>
</head>
<body>
Image upload
{{ title }}


</body>
</html>

Detecting the image format with imghdr in Python

2024-09-22 20:19:12 | python

Notes on how to detect the image format with imghdr in Python.

This checks detection both when the image is given as a file name and when it is given as image data.

import sys
import io
import imghdr


def read_image_file(image_file):
    image = None
    with open(image_file, 'rb') as inst:
        image = inst.read()

    return image


def main():
    image_file = sys.argv[1]

    # Detect from the image file path
    res = imghdr.what(image_file)
    print(f'image file: {res}')

    # Detect from the image bytes
    image = read_image_file(image_file)
    res = imghdr.what(io.BytesIO(image))
    print(f'image data: {res}')

    return 0


if __name__ == '__main__':
    res = main()
    exit(res)

Both the jpg file and the png file are detected correctly.

$ python test_imghdr.py jpg1.jpg
image file: jpeg
image data: jpeg

$ python test_imghdr.py png1.png
image file: png
image data: png

Even when the jpg file's extension is changed to png, the format is still detected correctly.

$ python test_imghdr.py jpg1.png

image file: jpeg
image data: jpeg
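
The results above are independent of the extension because detection looks at the leading bytes of the data. Note that imghdr has been deprecated since Python 3.11 and was removed in Python 3.13. The following is a minimal sketch of the same idea without imghdr; it covers only three formats, and sniff_image_type is a hypothetical name.

```python
# Leading "magic" bytes of a few common image formats.
SIGNATURES = [
    (b'\xff\xd8\xff', 'jpeg'),
    (b'\x89PNG\r\n\x1a\n', 'png'),
    (b'GIF87a', 'gif'),
    (b'GIF89a', 'gif'),
]


def sniff_image_type(data):
    """Return 'jpeg', 'png', or 'gif' based on the leading bytes, else None."""
    for signature, name in SIGNATURES:
        if data.startswith(signature):
            return name
    return None


if __name__ == '__main__':
    print(sniff_image_type(b'\xff\xd8\xff\xe0\x00\x10JFIF'))
    print(sniff_image_type(b'\x89PNG\r\n\x1a\n\x00\x00'))
    print(sniff_image_type(b'plain text'))
```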

Publishing an application through an Apache proxy

2024-09-22 18:04:53 | linux

Notes on how to publish an application written with Flask or similar on port 80 through Apache's proxy settings.

Here the application is assumed to be running on port 5001 under the /app/test/ path.

Add the settings to /etc/httpd/conf/httpd.conf

LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_http_module modules/mod_proxy_http.so

ProxyPass /app/test/ http://127.0.0.1:5001/app/test/
ProxyPassReverse /app/test/ http://127.0.0.1:5001/app/test/

Restart Apache

$ sudo apachectl restart

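Allowing HTTP connections

If /var/log/httpd/error_log contains a log entry like the following, use the setsebool command (an SELinux setting) to allow httpd to make network connections.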

(13)Permission denied: AH00957: http: attempt to connect to ...

$ sudo setsebool -P httpd_can_network_connect 1

Allowing HTTP access from other hosts with firewall-cmd

2024-09-22 14:04:43 | linux

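Notes on how to allow HTTP access from other hosts with firewall-cmd.

A Linux web server installed in VirtualBox on Windows could not be reached from Windows, so I used firewall-cmd to allow access.

Checking the published services

In the output below, http is not listed under services, so HTTP is not open.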

$ sudo firewall-cmd --info-zone public
public (active)
  target: default
  icmp-block-inversion: no
  interfaces: enp0s3
  sources:
  services: cockpit dhcpv6-client ssh
  ports:
  protocols:
  forward: yes
  masquerade: no
  forward-ports:
  source-ports:
  icmp-blocks:
  rich rules:

Checking the http service

$ sudo firewall-cmd --info-service http
http
  ports: 80/tcp
  protocols:
  source-ports:
  modules:
  destination:
  includes:
  helpers:

Opening the http service

$ sudo firewall-cmd --permanent --zone public --add-service http
success

Applying the settings

$ sudo firewall-cmd --reload
success

Checking the published services

In the output below, http is now listed under services.

$ sudo firewall-cmd --info-zone public
public (active)
  target: default
  icmp-block-inversion: no
  interfaces: enp0s3
  sources:
  services: cockpit dhcpv6-client http ssh
  ports:
  protocols:
  forward: yes
  masquerade: no
  forward-ports:
  source-ports:
  icmp-blocks:
  rich rules:
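
After the firewall change, reachability can also be confirmed from the client side. The following is a rough sketch of a TCP connection check in Python; is_port_open is a hypothetical name, and the host to test (for example the VM's IP address) has to be supplied by the caller.

```python
import socket


def is_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Calling is_port_open with the server's address and port 80 from the Windows side would be expected to return False before the http service is added and True afterwards, provided a web server is listening.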

Printing exception messages to stderr in Python

2024-09-18 23:33:53 | python

Notes on how to print an exception message to stderr in Python.

print({exception}, file=sys.stderr) prints the exception to stderr.

import sys
import json

try:
    a = '[0, 1, 2'
    obj = json.loads(a)
except Exception as e:
    print(e, file=sys.stderr)

Result

Expecting ',' delimiter: line 1 column 9 (char 8)

You can also convert the exception object to a string with str() before writing it.

import sys
import json

try:
    a = '[0, 1, 2'
    obj = json.loads(a)
except Exception as e:
    sys.stderr.write(str(e) + '\n')

Output

Expecting ',' delimiter: line 1 column 9 (char 8)
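
The message alone does not say which kind of exception occurred. As a small variation, sketched under the assumption that the exception class name is useful in logs, the type can be printed together with the message. The function name report_error is hypothetical.

```python
import sys


def report_error(exc, stream=None):
    """Write 'ExceptionType: message' to the stream (stderr by default)."""
    if stream is None:
        stream = sys.stderr
    print(f'{type(exc).__name__}: {exc}', file=stream)


if __name__ == '__main__':
    try:
        int('abc')
    except Exception as e:
        report_error(e)
```

The optional stream parameter exists only so that the output can be captured, for example in a test.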

json5 in Python

2024-09-03 23:24:47 | python

Notes on how to use the json5 library in Python.

json5 accepts strings like the ones below, which are errors in strict JSON.

Installation

pip install json5

Parsing with json5

> import json5
> a = '{"a": 123, "b": 456, }'
> json5.loads(a)
{'a': 123, 'b': 456}
> b = '[1, 2, ]'
> json5.loads(b)
[1, 2]
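
For contrast, the standard json module rejects the same inputs because of the trailing commas. The following sketch uses only the standard library; is_strict_json is a hypothetical helper name.

```python
import json


def is_strict_json(text):
    """Return True if the standard json module can parse the text."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


if __name__ == '__main__':
    for text in ['{"a": 123, "b": 456, }', '[1, 2, ]', '[1, 2]']:
        print(text, is_strict_json(text))
```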

Serializing with json5

> a = '{"a-b": 1, "a_b": 2}'
> obj = json5.loads(a)
> json5.dumps(obj)
'{"a-b": 1, a_b: 2}'


Notes on programming in Python, Ruby, and other languages, MySQL, server configuration, and more. Lego photos too.
