# RunPod Serverless - GPU Services

## Overview

RunPod Serverless provides on-demand GPUs for the GRACE, PENNY, and FACTORY services.

- Account: rpd@tzr.systems
- Balance: ~$69 USD
- Maximum quota: 5 concurrent workers
## Active Endpoints

| Service | Endpoint ID | Max Workers | Modules |
|---|---|---|---|
| GRACE | `rfltzijgn1jno4` | 2 | ASR, OCR, TTS, Face, Embeddings, Avatar |
| PENNY | `zsu7eah0fo7xt6` | 2 | TTS (voice) |
| FACTORY | `hffu4q5pywjzng` | 1 | Embeddings, processing |

Supported GPUs: RTX 3090, RTX 4090, RTX A4000, NVIDIA L4
## Critical Configuration

### Resolved Issue (2025-12-27)

Serverless workers would not start when endpoints were created through the GraphQL API. The fix was to create them through the REST API with the specific parameters below.

### Correct API: REST (not GraphQL)
```bash
# CORRECT - REST API
curl -X POST "https://rest.runpod.io/v1/endpoints" \
  -H "Authorization: Bearer $RUNPOD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "mi-endpoint",
    "templateId": "TEMPLATE_ID",
    "gpuTypeIds": ["NVIDIA GeForce RTX 3090", "NVIDIA GeForce RTX 4090"],
    "scalerType": "QUEUE_DELAY",
    "scalerValue": 4,
    "workersMin": 0,
    "workersMax": 2,
    "idleTimeout": 60,
    "executionTimeoutMs": 600000,
    "flashboot": true
  }'
```
Required fields:

- `gpuTypeIds`: array of strings (NOT a comma-separated string)
- `scalerType`: `"QUEUE_DELAY"`
- `scalerValue`: 4 (seconds of queue delay before scaling up)
- `flashboot`: true (fast startup)
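The field constraints above can be captured in a small helper that builds the endpoint-creation payload before sending it to `https://rest.runpod.io/v1/endpoints`. This is an illustrative sketch; the function name and keyword defaults are ours, not part of any RunPod SDK:

```python
def build_endpoint_payload(name, template_id, gpu_type_ids,
                           workers_max=2, idle_timeout=60,
                           execution_timeout_ms=600_000):
    """Build the JSON body for POST https://rest.runpod.io/v1/endpoints."""
    # gpuTypeIds must be an array of strings, not a comma-separated string
    if isinstance(gpu_type_ids, str):
        raise TypeError("gpuTypeIds must be a list of GPU names, not a string")
    return {
        "name": name,
        "templateId": template_id,
        "gpuTypeIds": list(gpu_type_ids),
        "scalerType": "QUEUE_DELAY",   # required
        "scalerValue": 4,              # seconds of queue delay before scaling up
        "workersMin": 0,
        "workersMax": workers_max,
        "idleTimeout": idle_timeout,
        "executionTimeoutMs": execution_timeout_ms,
        "flashboot": True,             # fast startup; workers failed to boot without it
    }
```

Catching a comma-separated `gpuTypeIds` string before the request is sent avoids the silent "0 workers" failure mode described under Troubleshooting.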
## API Usage

### Submit a Job

```bash
curl -X POST "https://api.runpod.ai/v2/{ENDPOINT_ID}/run" \
  -H "Authorization: $RUNPOD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"input": {"module": "ASR_ENGINE", "audio_base64": "..."}}'
```
### Check Job Status

```bash
curl "https://api.runpod.ai/v2/{ENDPOINT_ID}/status/{JOB_ID}" \
  -H "Authorization: $RUNPOD_API_KEY"
```
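A common pattern is to submit a job with `/run` and then poll `/status/{JOB_ID}` until it leaves the queue. A minimal polling sketch; the status fetcher is injected so the HTTP layer (e.g. a `requests.get` against the URL above) stays outside the helper, and the helper name is ours:

```python
import time

def wait_for_job(fetch_status, job_id, poll_interval=2.0, timeout=600.0):
    """Poll a status fetcher until the job reaches a terminal state.

    fetch_status(job_id) must return the parsed JSON of
    GET https://api.runpod.ai/v2/{ENDPOINT_ID}/status/{job_id},
    whose "status" field moves through IN_QUEUE / IN_PROGRESS to
    COMPLETED, FAILED, CANCELLED or TIMED_OUT.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        resp = fetch_status(job_id)
        if resp.get("status") in ("COMPLETED", "FAILED", "CANCELLED", "TIMED_OUT"):
            return resp
        time.sleep(poll_interval)
    raise TimeoutError(f"job {job_id} did not finish within {timeout}s")
```

Keeping the timeout below the endpoint's `executionTimeoutMs` (600000 ms here) avoids polling past the point where RunPod would have killed the job anyway.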
### Health Check

```bash
curl "https://api.runpod.ai/v2/{ENDPOINT_ID}/health" \
  -H "Authorization: $RUNPOD_API_KEY"
```

Expected response:

```json
{
  "jobs": {"completed": 5, "failed": 0, "inQueue": 0},
  "workers": {"idle": 1, "ready": 1, "running": 0}
}
```
## GRACE Modules

| Module | Description | Model |
|---|---|---|
| `ASR_ENGINE` | Speech-to-Text | Faster Whisper Large V3 |
| `OCR_CORE` | Text recognition | GOT-OCR 2.0 |
| `TTS` | Text-to-Speech | XTTS-v2 |
| `FACE_VECTOR` | Face vectors | InsightFace |
| `EMBEDDINGS` | Text embeddings | BGE-Large |
| `AVATAR_GEN` | Avatar generation | SDXL |
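Jobs select a module through the `input.module` field, as in the ASR example earlier. A small builder for GRACE job payloads; only the `module` key is confirmed by the examples in this document, and the per-module keyword parameters (e.g. `audio_base64` for ASR) are whatever each module expects:

```python
# Module names from the GRACE table above
GRACE_MODULES = {"ASR_ENGINE", "OCR_CORE", "TTS",
                 "FACE_VECTOR", "EMBEDDINGS", "AVATAR_GEN"}

def grace_input(module: str, **params) -> dict:
    """Build the request body for POST /v2/{ENDPOINT_ID}/run."""
    if module not in GRACE_MODULES:
        raise ValueError(f"unknown GRACE module: {module}")
    return {"input": {"module": module, **params}}
```

Validating the module name client-side turns a typo into an immediate error instead of a job that sits in the queue and fails on a worker.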
## Docker Image

Temporary registry: `ttl.sh/tzzr-grace:24h` (expires after 24 h)

### Rebuild

```bash
# On the deck server (72.62.1.113)
cd /tmp/docker-grace
docker build -t ttl.sh/tzzr-grace:24h .
docker push ttl.sh/tzzr-grace:24h
```
### Key Dockerfile lines

```dockerfile
FROM runpod/pytorch:2.1.0-py3.10-cuda11.8.0-devel-ubuntu22.04
# Fix blinker before installing requirements
RUN pip install --no-cache-dir --ignore-installed blinker
RUN pip install --no-cache-dir -r requirements.txt
```
## Troubleshooting

### Workers not starting (0 workers)

- Use the REST API, not GraphQL
- Verify that `gpuTypeIds` is an array
- Include `flashboot: true`
- Verify that the worker quota is not exceeded
### Job stuck in queue indefinitely

```bash
# Purge the queue
curl -X POST "https://api.runpod.ai/v2/{ENDPOINT_ID}/purge-queue" \
  -H "Authorization: $RUNPOD_API_KEY"
```
### Model error (TOS)

Add the environment variable `COQUI_TOS_AGREED=1` to accept the Coqui terms of service for the XTTS model.
## Credentials

Location: `/home/orchestrator/.secrets/runpod_api_key`

```bash
export RUNPOD_API_KEY=$(cat ~/.secrets/runpod_api_key)
```
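For services that read the key directly instead of going through the environment, the same file can be loaded in Python; the helper name is ours, the path is from this document:

```python
import os

def load_runpod_key(path="~/.secrets/runpod_api_key") -> str:
    """Read the RunPod API key from its secrets file, stripping the trailing newline."""
    with open(os.path.expanduser(path)) as f:
        return f.read().strip()
```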