AI ForensiX

Audio Deepfake Detection

Response schema and example outputs for the AI-ForensiX audio deepfake detection model.

The AI-ForensiX Audio Deepfake Detection model evaluates acoustic, frequency, and speech patterns to determine whether an audio clip is real, synthetic, or manipulated.
Each response contains a prediction label, a confidence score, an optional explainability heatmap, and a manipulation-type classification.


Response Schema

AudioDeepfakeDetectionResult

| Field | Type | Description |
| --- | --- | --- |
| label | string, one of "real", "fake" | Indicates whether the audio is authentic or manipulated/synthetic. |
| score | number (0.0 – 1.0) | Confidence score of the prediction based on acoustic analysis. |
| heatmap_url | string (URL), optional | URL to a spectrogram-based heatmap highlighting regions contributing to the decision. |
| source | string, one of "real", "replay", "tts", "voice clone", "voice conversion", "post processing" | Identifies the type of manipulation or authenticity. |
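
The fields above can be mirrored on the client side to catch malformed responses early. The following is a minimal sketch, not an official AI-ForensiX client: the names `parse_result`, `LABELS`, and `SOURCES` are hypothetical, and treating underscore spellings such as `voice_clone` as equivalent to the space-separated values is an assumption.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional

# Allowed values copied from the schema table.
LABELS = {"real", "fake"}
SOURCES = {"real", "replay", "tts", "voice clone",
           "voice conversion", "post processing"}


@dataclass(frozen=True)
class AudioDeepfakeDetectionResult:
    label: str
    score: float
    source: str
    heatmap_url: Optional[str] = None  # optional per the schema


def parse_result(payload: Mapping[str, Any]) -> AudioDeepfakeDetectionResult:
    """Validate a decoded JSON object against the documented fields."""
    label = payload["label"]
    if label not in LABELS:
        raise ValueError(f"unexpected label: {label!r}")

    score = payload["score"]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(score, bool) or not isinstance(score, (int, float)):
        raise ValueError("score must be a number")
    score = float(score)
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score out of range: {score}")

    # Assumption: "voice_clone" and "voice clone" name the same value.
    source = str(payload["source"]).replace("_", " ")
    if source not in SOURCES:
        raise ValueError(f"unexpected source: {source!r}")

    return AudioDeepfakeDetectionResult(
        label=label,
        score=score,
        source=source,
        heatmap_url=payload.get("heatmap_url"),
    )
```

A missing `label`, `score`, or `source` key raises `KeyError`, while a missing `heatmap_url` simply yields `None`.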

Source Classification Explanation

| Value | Meaning |
| --- | --- |
| real | Audio captured from a genuine human speaker. |
| replay | Replay attack detected (re-recorded authentic audio). |
| tts | Text-to-speech synthesized audio. |
| voice clone | AI-generated cloned voice mimicking a specific speaker. |
| voice conversion | Speaker identity transformed using voice conversion techniques. |
| post processing | Edited, spliced, or digitally manipulated audio. |
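
When surfacing results to end users, the `source` value can be translated into its documented meaning with a simple lookup. This is an illustrative sketch: `SOURCE_MEANINGS` and `describe_source` are hypothetical names, and the normalization of case, whitespace, and underscores is an assumption rather than documented behavior.

```python
# Meanings copied from the source classification table.
SOURCE_MEANINGS = {
    "real": "Audio captured from a genuine human speaker.",
    "replay": "Replay attack detected (re-recorded authentic audio).",
    "tts": "Text-to-speech synthesized audio.",
    "voice clone": "AI-generated cloned voice mimicking a specific speaker.",
    "voice conversion": "Speaker identity transformed using voice conversion techniques.",
    "post processing": "Edited, spliced, or digitally manipulated audio.",
}


def describe_source(source: str) -> str:
    """Return the documented meaning of a source value, or a fallback message."""
    # Assumption: tolerate case differences and underscore-separated spellings.
    key = source.strip().lower().replace("_", " ")
    return SOURCE_MEANINGS.get(key, f"Unknown source value: {source!r}")
```

Returning a fallback string instead of raising keeps a display layer working if a value outside the documented list ever appears.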

Example Responses


Listing 5: Fake Audio Detection Example

{
  "label": "fake",
  "score": 0.912,
  "heatmap_url": "https://forensiX.com/.png",
  "source": "voice clone"
}

Listing 6: Real Audio Detection Example

{
  "label": "real",
  "score": 0.987,
  "source": "real"
}
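
A response like the one above can be consumed as follows. This is a minimal sketch under stated assumptions: `summarize` is a hypothetical helper, and `REVIEW_THRESHOLD` is an illustrative cut-off that does not come from the AI-ForensiX documentation.

```python
import json

# Hypothetical cut-off for flagging low-confidence predictions.
REVIEW_THRESHOLD = 0.9


def summarize(raw: str, threshold: float = REVIEW_THRESHOLD) -> str:
    """Turn a raw JSON response into a one-line, human-readable summary."""
    result = json.loads(raw)
    line = f"{result['label']} ({result['source']}), score {result['score']:.3f}"
    if result["score"] < threshold:
        line += " [low confidence, review manually]"
    # heatmap_url is optional, so read it with .get() instead of indexing.
    heatmap = result.get("heatmap_url")
    if heatmap:
        line += f", heatmap: {heatmap}"
    return line


real_example = '{"label": "real", "score": 0.987, "source": "real"}'
print(summarize(real_example))  # prints: real (real), score 0.987
```

The real-audio example carries no `heatmap_url`, which is why the summary omits the heatmap part.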
