Transcriptor
https://github.com/aphananthe42/transcriptor/assets/68156481/d5f3ea30-b681-4f13-8219-b4e0e365709e
**The recorded content of the audio data**
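Translated from Japanese: "Thank you very much for coming today. Before the performance begins, we have a request for our guests. Please turn off mobile phones and any other devices that make sound. Also, please refrain from recording or photography without permission. We appreciate your cooperation."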
Overview
Transcriptor is a CLI tool for transcribing and summarizing audio data. It can also distinguish speakers and output the transcription separately for each speaker, which is useful for meeting minutes.
Requirements
- Deno >= 1.41.0
- Amazon S3 Bucket
- AWS IAM Access key pair
- OpenAI API key
Technologies
- Deno
- TypeScript
- Amazon S3
- Amazon Transcribe
System
- Transcriptor uploads the audio data to S3.
- Amazon Transcribe reads the audio data from S3.
- Amazon Transcribe writes the transcription result to S3 (the same bucket that stores the audio data).
- Transcriptor downloads the transcription result from S3.
- Transcriptor summarizes the transcription result via the OpenAI API.
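The flow above can be sketched as one orchestrating function. This is only an illustration under stated assumptions, not the actual Transcriptor source: the `ObjectStore`, `Transcriber`, and `Summarizer` interfaces and the `runPipeline` name are hypothetical stand-ins for the S3, Amazon Transcribe, and OpenAI API calls.

```typescript
// Hypothetical interfaces standing in for S3, Amazon Transcribe, and the OpenAI API.
interface ObjectStore {
  put(key: string, data: Uint8Array): Promise<void>;
  get(key: string): Promise<string>;
}

interface Transcriber {
  // Reads the audio object at `audioKey`, writes the result to the same bucket,
  // and resolves with the key of the transcription result.
  transcribe(audioKey: string): Promise<string>;
}

interface Summarizer {
  summarize(transcript: string): Promise<string>;
}

// Runs the steps in the order listed above and resolves with the summary.
async function runPipeline(
  audio: Uint8Array,
  audioKey: string,
  store: ObjectStore,
  transcriber: Transcriber,
  summarizer: Summarizer,
): Promise<string> {
  // Upload the audio data.
  await store.put(audioKey, audio);
  // Transcribe it; the result is stored next to the audio.
  const resultKey = await transcriber.transcribe(audioKey);
  // Download the transcription result.
  const transcript = await store.get(resultKey);
  // Summarize the transcript.
  return summarizer.summarize(transcript);
}
```

Keeping each service behind a small interface makes the ordering easy to test with in-memory fakes, without touching AWS or OpenAI.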
Usage
0. Install Deno and set the PATH
$ curl -fsSL https://deno.land/x/install/install.sh | sh
Add the location of the deno executable to the PATH variable in your shell profile (e.g., ~/.bashrc, ~/.bash_profile, or ~/.zshrc).
export DENO_INSTALL="$HOME/.deno"
export PATH="$DENO_INSTALL/bin:$PATH"
1. Install transcriptor from deno.land/x
$ deno install --allow-env --allow-sys --allow-read --allow-net https://deno.land/x/transcriptor@v1.1.4/src/transcriptor.ts
2. Create a .env file and fill in the environment variables as in the following example.
$ touch .env
# or add the following if .env already exists.
AWS_ACCESS_KEY_ID="YOUR_AWS_IAM_ACCESS_KEY_ID"
AWS_SECRET_ACCESS_KEY="YOUR_AWS_IAM_SECRET_ACCESS_KEY"
AWS_REGION="your-aws-region"
TRANSCRIPTOR_S3_BUCKET_NAME='your-s3-bucket-name'
OPENAI_API_KEY="YOUR_OPENAI_API_KEY"
OPENAI_GPT_MODEL="your-preferred-gpt-model (e.g. gpt-3.5-turbo)"
TRANSCRIPTOR_SYSTEM_PROMPT="system prompt for summarizing with GPT"
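
Before running the tool, it can help to confirm that the variables above are set. The following is a minimal sketch, not part of Transcriptor itself: the `findMissingEnvVars` helper is hypothetical. Treating `OPENAI_GPT_MODEL` and `TRANSCRIPTOR_SYSTEM_PROMPT` as optional is an assumption made for this example.

```typescript
// Variable names taken from the .env example above.
// Assumption: the model and system prompt variables are optional.
const REQUIRED_ENV_VARS = [
  "AWS_ACCESS_KEY_ID",
  "AWS_SECRET_ACCESS_KEY",
  "AWS_REGION",
  "TRANSCRIPTOR_S3_BUCKET_NAME",
  "OPENAI_API_KEY",
] as const;

// Returns the names of required variables that are unset or blank.
function findMissingEnvVars(
  env: Record<string, string | undefined>,
): string[] {
  return REQUIRED_ENV_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}
```

In Deno, `findMissingEnvVars(Deno.env.toObject())` would check the current process environment; this needs the `--allow-env` permission.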
3. Run the script as shown below.
$ transcriptor --file='path/to/your/audio/data/to/summarize'
Argument options
--lang='ja-JP'
// The language spoken in the audio file.
// default: 'ja-JP'
--model='gpt-3.5-turbo'
// The name of the GPT model used for summarizing.
// default: 'gpt-3.5-turbo'
--speakerCount='4'
// The number of speakers in the audio data.
// default: 1
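
The options and defaults above can be illustrated with a small parser. This is a sketch only, not how Transcriptor actually reads its arguments: the `CliOptions` type and `parseOptions` function are hypothetical. Rejecting a non-positive `--speakerCount` is an assumption of this example.

```typescript
interface CliOptions {
  file?: string;
  lang: string;
  model: string;
  speakerCount: number;
}

// Defaults as documented above.
const DEFAULT_OPTIONS: CliOptions = {
  lang: "ja-JP",
  model: "gpt-3.5-turbo",
  speakerCount: 1,
};

// Parses `--key=value` arguments. The shell has already removed any quotes,
// so `--lang='ja-JP'` arrives here as `--lang=ja-JP`.
function parseOptions(args: string[]): CliOptions {
  const options: CliOptions = { ...DEFAULT_OPTIONS };
  for (const arg of args) {
    const match = /^--([A-Za-z]+)=(.*)$/.exec(arg);
    if (match === null) continue; // skip anything that is not --key=value
    const [, key = "", value = ""] = match;
    switch (key) {
      case "file":
        options.file = value;
        break;
      case "lang":
        options.lang = value;
        break;
      case "model":
        options.model = value;
        break;
      case "speakerCount": {
        const count = Number(value);
        // Assumption: only positive integers are accepted.
        if (!Number.isInteger(count) || count < 1) {
          throw new Error(`invalid --speakerCount: ${value}`);
        }
        options.speakerCount = count;
        break;
      }
    }
  }
  return options;
}
```

For example, `parseOptions(["--file=meeting.mp3", "--speakerCount=4"])` keeps the default `lang` and `model` and sets `speakerCount` to the number 4.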