screenpipe automatically transcribes all audio from your meetings, calls, and conversations. by default, everything runs locally using Whisper.
for the full botless meeting workflow - live transcript, speaker cleanup, calendar enrichment, summaries, copy transcript, and APIs - see meeting intelligence.

setup

audio recording is enabled by default in the desktop app. configure audio devices and transcription engine in settings.
  • audio devices: select which microphones and system audio to capture
  • transcription engine: choose between local Whisper (private) or Deepgram (faster, cloud)

search transcriptions

# find discussions about a topic
curl "http://localhost:3030/search?q=budget+review&content_type=audio&limit=10"

# get audio recorded since a start time (here, midnight UTC on 2026-02-11)
curl "http://localhost:3030/search?content_type=audio&start_time=2026-02-11T00:00:00Z"

# filter by speaker
curl "http://localhost:3030/search?content_type=audio&speaker_ids=1,2"
curl "http://localhost:3030/search?content_type=audio&speaker_name=John"
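
the start_time in the example above is a fixed date. a minimal sketch of computing it at run time is shown here, assuming midnight UTC is an acceptable definition of "today" and that start_time and limit can be combined in one request. the helper names (today_start_utc, audio_search_url, search_audio_today) are hypothetical and not part of screenpipe.

```shell
# base URL taken from the examples above
SCREENPIPE_URL="http://localhost:3030"

# midnight UTC of the current day, in the same ISO 8601 form used above
today_start_utc() {
  date -u +"%Y-%m-%dT00:00:00Z"
}

# hypothetical helper: build an audio search URL
# $1 = start time, $2 = result limit (defaults to 10)
audio_search_url() {
  printf '%s/search?content_type=audio&start_time=%s&limit=%s\n' \
    "$SCREENPIPE_URL" "$1" "${2:-10}"
}

# hypothetical helper: fetch audio transcriptions recorded since midnight UTC today
# $1 = result limit (optional)
search_audio_today() {
  curl -s "$(audio_search_url "$(today_start_utc)" "${1:-10}")"
}
```

calling search_audio_today 20 would request up to 20 results from the local server. if your meetings are scheduled in local time, adjust the start time to your own timezone offset instead of midnight UTC.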

speaker identification

screenpipe automatically identifies different speakers. connecting a calendar improves naming accuracy, and the API commands at the end of this section let you label, search, and merge speakers.

improve speaker identification with calendar

connect your Google Calendar to improve speaker identification accuracy. screenpipe uses your calendar’s attendee list to name speakers automatically during meetings. if a meeting has exactly 2 attendees, the other speaker is identified without manual labeling.

to enable this:
  1. go to settings → connections → Google Calendar
  2. authorize screenpipe to access your calendar
  3. during future meetings, attendee names from your calendar will automatically label speakers

this works best for 1:1 meetings and structured calls. for larger meetings (3+ attendees), calendar context is tagged to your notes for later reference.

manage speakers via the API:

# get unnamed speakers for labeling
curl "http://localhost:3030/speakers/unnamed?limit=10"

# update a speaker's name
curl -X POST http://localhost:3030/speakers/update \
  -H "Content-Type: application/json" \
  -d '{"id": 1, "name": "John Smith"}'

# search speakers by name
curl "http://localhost:3030/speakers/search?name=john"

# merge duplicate speakers
curl -X POST http://localhost:3030/speakers/merge \
  -H "Content-Type: application/json" \
  -d '{"speaker_to_keep_id": 1, "speaker_to_merge_id": 2}'

# find similar speakers
curl "http://localhost:3030/speakers/similar?speaker_id=1"
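
when labeling many speakers, the update and merge calls above can be wrapped in small functions. this is a sketch only: the function names are hypothetical, the JSON bodies mirror the examples above, and the name is inserted without escaping, so it is assumed to contain no double quotes or backslashes.

```shell
# base URL taken from the examples above
SCREENPIPE_URL="http://localhost:3030"

# hypothetical helper: JSON body for /speakers/update
# $1 = speaker id, $2 = name (must not contain double quotes or backslashes)
speaker_update_payload() {
  printf '{"id": %d, "name": "%s"}\n' "$1" "$2"
}

# hypothetical helper: JSON body for /speakers/merge
# $1 = id of the speaker to keep, $2 = id of the speaker to merge into it
speaker_merge_payload() {
  printf '{"speaker_to_keep_id": %d, "speaker_to_merge_id": %d}\n' "$1" "$2"
}

# hypothetical wrapper: rename a speaker
rename_speaker() {
  curl -s -X POST "$SCREENPIPE_URL/speakers/update" \
    -H "Content-Type: application/json" \
    -d "$(speaker_update_payload "$1" "$2")"
}

# hypothetical wrapper: merge speaker $2 into speaker $1
merge_speakers() {
  curl -s -X POST "$SCREENPIPE_URL/speakers/merge" \
    -H "Content-Type: application/json" \
    -d "$(speaker_merge_payload "$1" "$2")"
}
```

for example, rename_speaker 1 "John Smith" followed by merge_speakers 1 2 would label speaker 1 and then fold speaker 2 into it. check the similar-speakers output before merging.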

tips

  • use a good microphone
  • reduce background noise
  • whisper-large-v3-turbo gives best accuracy
  • set language to English in settings if you only speak English (faster)

long meetings and batch sizing

by default, screenpipe batches audio for transcription in chunks:
  • Whisper/OpenAI: 600 seconds (10 minutes)
  • Deepgram: up to 5000 seconds (about 83 minutes)
if meetings longer than one hour lose context between batches, you can change the batch size in settings → advanced → batch_max_duration_secs. set it to your typical meeting duration to preserve context across the entire recording.
in smart/batch transcription mode, long meetings may still be split across multiple transcription jobs. if you need full meeting context in a single batch, consider:
  • switching to realtime transcription (audio is transcribed as it is captured, trading cost/latency for guaranteed continuity)
  • increasing batch_max_duration_secs to match your meeting length (supported up to engine limits: 5000s for Deepgram, 3000s for OpenAI)
  • using the retranscription API to re-process a full meeting with custom settings
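
to see how a given batch size affects a meeting, the arithmetic can be sketched as below. it assumes audio is cut into fixed-length batches of batch_max_duration_secs, which is a simplification of whatever splitting screenpipe actually does. batch_count and clamp_batch_secs are hypothetical helper names.

```shell
# number of batches needed for a recording, assuming fixed-length batches
# $1 = meeting length in seconds, $2 = batch_max_duration_secs
batch_count() {
  echo $(( ($1 + $2 - 1) / $2 ))   # integer ceiling of $1 / $2
}

# clamp a requested batch size to an engine limit
# $1 = requested seconds, $2 = engine limit in seconds
clamp_batch_secs() {
  if [ "$1" -gt "$2" ]; then echo "$2"; else echo "$1"; fi
}

batch_count 5400 600          # 90-minute meeting at the 600s default: prints 9
batch_count 5400 5000         # same meeting at the Deepgram limit: prints 2
clamp_batch_secs 5400 3000    # 90 minutes exceeds the OpenAI limit: prints 3000
```

under this assumption, a 90-minute meeting cannot fit in one batch on either engine, so realtime transcription or the retranscription API would be the options for full-meeting context.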

privacy

  • with the default Whisper engine, all transcription runs locally on your device
  • audio files are stored in ~/.screenpipe/data/
  • no audio is sent to the cloud unless you choose Deepgram
  • you can disable audio recording in app settings
questions? join our discord.