docs: Add RunPod Serverless configuration and troubleshooting

- Document working REST API configuration (vs broken GraphQL)
- Add endpoint IDs for GRACE, PENNY, FACTORY
- Include troubleshooting for workers not starting
- Document Docker image rebuild process

New file: `05_OPERACIONES/runpod-serverless.md` (169 lines)

# RunPod Serverless - GPU Services

## Overview

RunPod Serverless provides on-demand GPUs for the GRACE, PENNY, and FACTORY services.

**Account**: rpd@tzr.systems
**Balance**: ~$69 USD
**Max quota**: 5 concurrent workers

---

## Active Endpoints

| Service | Endpoint ID | Max Workers | Modules |
|---------|-------------|-------------|---------|
| **GRACE** | `rfltzijgn1jno4` | 2 | ASR, OCR, TTS, Face, Embeddings, Avatar |
| **PENNY** | `zsu7eah0fo7xt6` | 2 | TTS (voice) |
| **FACTORY** | `hffu4q5pywjzng` | 1 | Embeddings, processing |

**Supported GPUs**: RTX 3090, RTX 4090, RTX A4000, NVIDIA L4
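
For the examples below it is convenient to keep the endpoint IDs in shell variables. This is only a helper sketch; the variable names are ours, not part of any existing tooling, and the IDs are copied from the table above.

```bash
# Convenience variables for the endpoints listed above (names are ours).
GRACE_ENDPOINT_ID="rfltzijgn1jno4"
PENNY_ENDPOINT_ID="zsu7eah0fo7xt6"
FACTORY_ENDPOINT_ID="hffu4q5pywjzng"

# Base URL pattern used by the serverless job API.
GRACE_API="https://api.runpod.ai/v2/${GRACE_ENDPOINT_ID}"
```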

---

## Critical Configuration

### Problem Solved (2025-12-27)

Serverless workers would not start when using the GraphQL API. The solution was to use the **REST API** with the specific parameters shown below.

### Correct API: REST (not GraphQL)

```bash
# CORRECT - REST API
curl -X POST "https://rest.runpod.io/v1/endpoints" \
  -H "Authorization: Bearer $RUNPOD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "mi-endpoint",
    "templateId": "TEMPLATE_ID",
    "gpuTypeIds": ["NVIDIA GeForce RTX 3090", "NVIDIA GeForce RTX 4090"],
    "scalerType": "QUEUE_DELAY",
    "scalerValue": 4,
    "workersMin": 0,
    "workersMax": 2,
    "idleTimeout": 60,
    "executionTimeoutMs": 600000,
    "flashboot": true
  }'
```

**Required fields** (a quick payload sanity check follows below):
- `gpuTypeIds`: array of strings (NOT a comma-separated string)
- `scalerType`: "QUEUE_DELAY"
- `scalerValue`: 4 (seconds of queue delay before scaling up)
- `flashboot`: true (fast startup)
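
A minimal way to catch the most common mistake (passing `gpuTypeIds` as a string instead of an array) before calling the API. This is only a sketch: it assumes the request body has been saved to a hypothetical local file `endpoint.json` and that `jq` is installed.

```bash
# Sanity-check the request body before sending it (endpoint.json is a
# hypothetical local file holding the JSON payload shown above).
jq -e '(.gpuTypeIds | type == "array")
       and .scalerType == "QUEUE_DELAY"
       and .flashboot == true' endpoint.json \
  && echo "payload looks OK" \
  || echo "payload is missing or mistyping a required field"
```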

---

## API Usage

### Submit a Job

```bash
curl -X POST "https://api.runpod.ai/v2/{ENDPOINT_ID}/run" \
  -H "Authorization: $RUNPOD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"input": {"module": "ASR_ENGINE", "audio_base64": "..."}}'
```

### Check Status

```bash
curl "https://api.runpod.ai/v2/{ENDPOINT_ID}/status/{JOB_ID}" \
  -H "Authorization: $RUNPOD_API_KEY"
```
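
Putting the two calls together: a small sketch that submits a job and polls its status until it finishes. The `.id` and `.status` fields and the `COMPLETED`/`FAILED` values follow RunPod's documented serverless responses, but treat them as assumptions to verify; `AUDIO_B64` is a hypothetical variable holding base64-encoded audio.

```bash
# Submit a job to GRACE and poll until it reaches a terminal state.
ENDPOINT_ID="rfltzijgn1jno4"   # GRACE, from the table above
AUDIO_B64="..."                # base64-encoded audio (placeholder)

JOB_ID=$(curl -s -X POST "https://api.runpod.ai/v2/${ENDPOINT_ID}/run" \
  -H "Authorization: $RUNPOD_API_KEY" \
  -H "Content-Type: application/json" \
  -d "{\"input\": {\"module\": \"ASR_ENGINE\", \"audio_base64\": \"${AUDIO_B64}\"}}" \
  | jq -r '.id')

while true; do
  STATUS=$(curl -s "https://api.runpod.ai/v2/${ENDPOINT_ID}/status/${JOB_ID}" \
    -H "Authorization: $RUNPOD_API_KEY" | jq -r '.status')
  echo "job ${JOB_ID}: ${STATUS}"
  if [ "$STATUS" = "COMPLETED" ] || [ "$STATUS" = "FAILED" ]; then
    break
  fi
  sleep 2
done
```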

### Health Check

```bash
curl "https://api.runpod.ai/v2/{ENDPOINT_ID}/health" \
  -H "Authorization: $RUNPOD_API_KEY"
```

Expected response:
```json
{
  "jobs": {"completed": 5, "failed": 0, "inQueue": 0},
  "workers": {"idle": 1, "ready": 1, "running": 0}
}
```
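
Based on the response shape above, a quick scripted check for worker availability might look like this (a sketch; it assumes `jq` and reuses the `ENDPOINT_ID` variable from the previous example).

```bash
# Warn if the endpoint has no idle or ready workers.
curl -s "https://api.runpod.ai/v2/${ENDPOINT_ID}/health" \
  -H "Authorization: $RUNPOD_API_KEY" \
  | jq -e '(.workers.idle + .workers.ready) > 0' >/dev/null \
  && echo "workers available" \
  || echo "no workers available - see Troubleshooting"
```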

---

## GRACE Modules

| Module | Description | Model |
|--------|-------------|-------|
| `ASR_ENGINE` | Speech-to-text | Faster Whisper Large V3 |
| `OCR_CORE` | Text recognition | GOT-OCR 2.0 |
| `TTS` | Text-to-speech | XTTS-v2 |
| `FACE_VECTOR` | Face vectors | InsightFace |
| `EMBEDDINGS` | Text embeddings | BGE-Large |
| `AVATAR_GEN` | Avatar generation | SDXL |
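
All GRACE jobs select their module through the `input.module` field, as in the ASR example above. The sketch below shows the generic payload shape; the module-specific input fields (here a hypothetical `text` field for `EMBEDDINGS`) are assumptions that depend on each handler.

```bash
# Generic payload shape for a GRACE module. Only "module" is documented
# above; "text" is an assumed input field for EMBEDDINGS.
MODULE="EMBEDDINGS"
PAYLOAD=$(jq -n --arg m "$MODULE" --arg t "example text" \
  '{input: {module: $m, text: $t}}')

curl -s -X POST "https://api.runpod.ai/v2/${GRACE_ENDPOINT_ID}/run" \
  -H "Authorization: $RUNPOD_API_KEY" \
  -H "Content-Type: application/json" \
  -d "$PAYLOAD"
```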

---

## Docker Image

**Temporary registry**: `ttl.sh/tzzr-grace:24h` (expires after 24h)

### Rebuild

```bash
# On the deck server (72.62.1.113)
cd /tmp/docker-grace
docker build -t ttl.sh/tzzr-grace:24h .
docker push ttl.sh/tzzr-grace:24h
```
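
Because ttl.sh tags expire, it can be worth confirming that the freshly pushed image is actually retrievable before pointing RunPod at it. A minimal check from any machine with Docker:

```bash
# Pull the image back to confirm it is still available in the registry.
docker pull ttl.sh/tzzr-grace:24h
```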

### Key Dockerfile lines

```dockerfile
FROM runpod/pytorch:2.1.0-py3.10-cuda11.8.0-devel-ubuntu22.04

# Copy the dependency list into the image (needed for the install below)
COPY requirements.txt .

# Fix blinker before installing requirements
RUN pip install --no-cache-dir --ignore-installed blinker
RUN pip install --no-cache-dir -r requirements.txt
```

---

## Troubleshooting

### Workers don't start (0 workers)

1. Use the REST API, not GraphQL
2. Verify that `gpuTypeIds` is an array
3. Include `flashboot: true`
4. Verify the worker quota is not exceeded (see the sketch below)
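
For step 4, a rough way to see how much of the 5-worker account quota is in use is to sum the worker counts reported by each endpoint's health check. This is only a sketch: whether idle workers count against the quota is an assumption, and the field names are taken from the health-check response above.

```bash
# Rough quota check: add up idle/ready/running workers across endpoints.
TOTAL=0
for EP in rfltzijgn1jno4 zsu7eah0fo7xt6 hffu4q5pywjzng; do
  N=$(curl -s "https://api.runpod.ai/v2/${EP}/health" \
        -H "Authorization: $RUNPOD_API_KEY" \
      | jq '[.workers.idle, .workers.ready, .workers.running] | add')
  TOTAL=$((TOTAL + N))
done
echo "workers in use: ${TOTAL} / 5"
```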

### Job stuck in the queue indefinitely

```bash
# Purge the queue
curl -X POST "https://api.runpod.ai/v2/{ENDPOINT_ID}/purge-queue" \
  -H "Authorization: $RUNPOD_API_KEY"
```

### Model error (TOS)

Add the environment variable `COQUI_TOS_AGREED=1`.

---

## Credentials

Location: `/home/orchestrator/.secrets/runpod_api_key`

```bash
export RUNPOD_API_KEY=$(cat ~/.secrets/runpod_api_key)
```
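
A quick way to confirm the exported key is accepted is to hit one of the documented health endpoints and check the HTTP status code (the GRACE endpoint ID comes from the table above).

```bash
# Sanity check: expect "200" if the API key is valid.
export RUNPOD_API_KEY=$(cat ~/.secrets/runpod_api_key)
curl -s -o /dev/null -w "%{http_code}\n" \
  "https://api.runpod.ai/v2/rfltzijgn1jno4/health" \
  -H "Authorization: $RUNPOD_API_KEY"
```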

---

## References

- [RunPod REST API](https://docs.runpod.io/api-reference)
- [Serverless Endpoints](https://docs.runpod.io/serverless/endpoints/manage-endpoints)
- [Status Page](https://uptime.runpod.io)