Commit cd0e54e

Feature: Implement LibriVox API import with full data, language conversion, image fix, and layout fix
1 parent 8df13e7 commit cd0e54e

10 files changed

Lines changed: 594 additions & 187 deletions

PROGRESS_LOG.md

Lines changed: 68 additions & 2 deletions
@@ -10,7 +10,7 @@ The project was recently migrated from a restrictive host (20i.com) to a more ro
 **Challenge:**
 The current process for importing ~20,000 audiobooks is manual, unreliable, and slow. It relies on an Artisan command that can only process ~100 items at a time before timing out or failing. This requires constant manual intervention.
 
-Previous attempts to import from the **Librivox** and **Internet Archive** APIs were problematic due to the restrictive hosting environment, leading to complex workarounds. Specific issues were encountered with inconsistent or missing metadata such as **tags, narrators, and languages**, making the import logic fragile.
+Previous attempts to import from the **Librivox** and **Internet Archive** APIs were problematic due to the restrictive hosting environment, leading to complex workarounds. Specific issues were encountered with inconsistent or missing metadata such as **tags, narrators, and languages**, making the import logic fragile.
 
 **Goal:**
 Architect and implement a robust, automated, and scalable solution for importing all audiobooks without manual oversight or server timeouts.
@@ -21,4 +21,70 @@ Architect and implement a robust, automated, and scalable solution for importing
 3. A queue worker will be configured (using the `database` or `redis` driver) and set up to run persistently on the server via Ploi to process these jobs from the queue in the background.
 
 **Next Steps:**
-Begin implementation of the queue-based import system, starting with configuring the queue driver and creating the `ImportAudiobook` Job.
+Begin implementation of the queue-based import system, starting with configuring the queue driver and creating the `ImportAudiobook` Job.
+
+### Session: 2025-07-03
+
+**Project:** Librostream.com
+
+**Summary:**
+Focused on resolving critical deployment and application errors on the new Digital Ocean + Ploi.io stack, automating audiobook imports, and fixing homepage data counters.
+
+**Work Completed:**
+1. **Initial HTTP ERROR 500 Resolution:** Diagnosed and resolved the `HTTP ERROR 500` on the live site. The issue was traced to incorrect file permissions on `storage` and `bootstrap/cache` directories. Resolved by setting ownership to `librostream-3odwm:librostream-3odwm` and permissions to `775`.
+2. **Homepage Counter Fix:** Modified `app/Http/Controllers/AudiobookController.php` to correctly count audiobooks by filtering for non-null slugs, ensuring the homepage counter reflects only visible audiobooks. This fix was verified on localhost.
+3. **Audiobook Import Automation Commands Created:**
+    * `app/Console/Commands/FullLibriVoxImport.php`: New Artisan command to continuously fetch main audiobook data from LibriVox (Archive.org) using pagination.
+    * `app/Console/Commands/FullLibriVoxSectionsImport.php`: New Artisan command to continuously fetch and store sections for existing audiobooks.
+
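Review note on the homepage counter fix: the change described in item 2 boils down to counting only rows whose slug is set. The following is a minimal, framework-free sketch of that filter. The function name `countVisibleAudiobooks` and the array shape are illustrative assumptions, not code taken from `AudiobookController.php`.

```php
<?php

declare(strict_types=1);

/**
 * Count audiobooks that have a slug and are therefore reachable on the site.
 * Hypothetical helper: rows are plain arrays with an optional 'slug' key.
 *
 * @param array<int, array<string, mixed>> $audiobooks
 */
function countVisibleAudiobooks(array $audiobooks): int
{
    $visible = array_filter(
        $audiobooks,
        // isset() is false for both a missing key and an explicit null.
        static fn (array $book): bool => isset($book['slug'])
    );

    return count($visible);
}

// Sample data (made up for illustration): only the first row has a slug.
$rows = [
    ['id' => 1, 'slug' => 'moby-dick'],
    ['id' => 2, 'slug' => null],
    ['id' => 3],
];

echo countVisibleAudiobooks($rows), PHP_EOL;
```

In an Eloquent query the same condition is typically expressed with `whereNotNull('slug')` before `count()`, which keeps the filtering in the database instead of in PHP.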
+**Current Challenges & Status:**
+1. **Persistent Deployment Issues with Ploi.io:**
+    * **Git Permission Denied:** `error: could not lock config file .git/config: Permission denied` during `git pull`.
+    * **Chmod Script Error:** `chmod: cannot access 'ploi-e23888159c3a6d2c4ad47fa41f7116d3.sh': No such file or directory`.
+    * **Webhook Failure:** GitHub Actions successfully triggers the Ploi.io deploy webhook, but Ploi.io does not initiate or log any deployment activity.
+    * **Server Access Limitations:** The `ploi` SSH user lacks `sudo` privileges, and Ploi.io's dashboard does not provide a file manager or a way to run commands as `root` or `librostream-3odwm`, preventing manual resolution of permission issues after deployments.
+    * **Ploi.io Support Unresponsive:** Previous attempts to get support for these deployment issues have been unsuccessful.
+
+### Session: 2025-07-03 (Continued)
+
+**Summary:**
+Re-assessed audiobook import strategy following user feedback on previous import failures and preference for a "slow but complete" synchronous import, including all associated data (sections) at once, and automation via cron jobs. Addressed issues with messy language codes and missing images/audio files. Decided to **start from scratch** by switching the import source from Internet Archive API to the official LibriVox API to ensure cleaner data.
+
+**Problem:**
+Previous import attempts resulted in system crashes and corrupted/incomplete data. The user prefers a single, synchronous import process per audiobook that includes all its sections, rather than separate import steps or a queue-based system. Automation via cron jobs is also required. Additionally, language data from the Internet Archive API was in ISO codes (e.g., `cat`, `mul`, `deu`, `ita`) instead of full language names, and some images/audio files were not present. The core issue was identified as data quality from the Internet Archive API, leading to the decision to switch to the official LibriVox API.
+
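Review note on the language codes: the problem statement mentions three-letter ISO codes such as `cat`, `mul`, `deu`, and `ita` showing up where full names were expected. A lookup with a visible fallback could be sketched as below. The map is a small illustrative subset and `languageName` is a hypothetical helper; the actual contents of `config/languages.php` are not shown in this commit view.

```php
<?php

declare(strict_types=1);

// Illustrative subset only; the real mapping file is not visible here.
const LANGUAGE_NAMES = [
    'cat' => 'Catalan',
    'deu' => 'German',
    'eng' => 'English',
    'ita' => 'Italian',
    'mul' => 'Multiple languages',
];

/**
 * Translate an ISO language code into a display name.
 * Unknown codes are returned unchanged so they stay visible in the data
 * and can be added to the map later.
 */
function languageName(string $code): string
{
    // Normalise case and surrounding whitespace before the lookup.
    $key = strtolower(trim($code));

    return LANGUAGE_NAMES[$key] ?? $code;
}

echo languageName('deu'), PHP_EOL;
echo languageName('xyz'), PHP_EOL;
```

Returning the raw code for unknown values is one possible design choice; logging the miss would be another, depending on how strict the import should be.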
+**Revised Plan for Robust, Complete, and Automated Audiobook Import (using Official LibriVox API) - Incorporating Advisor Feedback:**
+
+1. **Database Reset (Crucial User Action)**:
+    * **Action Required by User**: Before any new import, all existing audiobook, audiobook section, and category data in the database **must be cleared**. This ensures a clean slate, preventing conflicts or corruption from previous, problematic imports. This is a manual step (e.g., truncating tables).
+
+2. **Create `config/languages.php`**:
+    * **Action by Cline**: Created a new configuration file (`config/languages.php`) to centrally store ISO language code to full name mappings. (Completed)
+
+3. **Reconfigure `app/Services/LibriVoxService.php`**:
+    * **Action by Cline**: Updated `baseUrl` for audiobooks to `https://librivox.org/api/feed/audiobooks` and `itemApiBaseUrl` to `https://librivox.org/api/feed/audiotracks`.
+    * **Action by Cline**: Modified `fetchAudiobooks` to request `format=json`, `extended=1`, and `coverart=1`, and parse the JSON response.
+    * **Action by Cline**: Modified `fetchAudiobookFiles` (renamed to `fetchAudiobookTracks` for clarity) to use the `audiotracks` endpoint with `project_id` and parse its JSON response for section details.
+    * **Enhancement (Error Handling)**: Added robust `try/catch` blocks around all API calls within this service with detailed logging of failures. (Completed)
+
+4. **Revise `app/Console/Commands/FetchLibriVoxAudiobooks.php`**:
+    * **Action by Cline**: Updated to consume JSON data from the reconfigured `LibriVoxService`.
+    * **Action by Cline**: Adjusted data mapping logic for LibriVox API's response structure, including `cover_image` and `librivox_id`.
+    * **Enhancement (Dry Run)**: Implemented a `--dry-run` flag to simulate imports without database writes.
+    * **Enhancement (Language Code Mapping)**: Integrated the `config/languages.php` for language conversion.
+    * **Enhancement (Error Handling + Logging)**: Added robust `try/catch` blocks around database write operations and logging of failures per audiobook ID.
+    * **Action by Cline**: Ensured synchronous, complete import for each audiobook (main data + sections from respective LibriVox API endpoints).
+    * **Enhancement (Progress Feedback)**: Maintained and enhanced progress feedback.
+    * **Fix (Broken Images)**: Prioritized `coverart_jpg` and `coverart_thumbnail` fields from LibriVox API for `cover_image` URL. (Completed)
+
+5. **Update `PROGRESS_LOG.md`**:
+    * **Action by Cline**: Updated this log to reflect the complete shift to the official LibriVox API and all incorporated enhancements. (Completed)
+
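Review note on the service reconfiguration: step 3 of the plan names two endpoints and the query parameters `format=json`, `extended=1`, `coverart=1`, and `project_id`. The request URLs could be assembled roughly as follows. This is a sketch, not the code in `LibriVoxService.php`: the function names are hypothetical, the `limit` and `offset` paging parameters are an assumption not stated in the log, and passing `format=json` to the `audiotracks` endpoint is inferred from the log's mention of a JSON response.

```php
<?php

declare(strict_types=1);

// Endpoint URLs as quoted in the progress log.
const AUDIOBOOKS_URL = 'https://librivox.org/api/feed/audiobooks';
const AUDIOTRACKS_URL = 'https://librivox.org/api/feed/audiotracks';

/** Build the URL for one page of audiobooks (hypothetical helper). */
function audiobooksUrl(int $limit, int $offset): string
{
    $query = http_build_query([
        'format'   => 'json',
        'extended' => 1,
        'coverart' => 1,
        'limit'    => $limit,  // assumption: paging parameter
        'offset'   => $offset, // assumption: paging parameter
    ], '', '&');

    return AUDIOBOOKS_URL . '?' . $query;
}

/** Build the URL for the tracks of a single project (hypothetical helper). */
function audiotracksUrl(int $projectId): string
{
    $query = http_build_query([
        'format'     => 'json', // inferred from the log, not confirmed
        'project_id' => $projectId,
    ], '', '&');

    return AUDIOTRACKS_URL . '?' . $query;
}

echo audiobooksUrl(50, 0), PHP_EOL;
echo audiotracksUrl(66), PHP_EOL;
```

Using `http_build_query` rather than string concatenation keeps the parameters URL-encoded and makes it easy to add or drop a parameter when the API response needs tuning.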
+**Current Status & Remaining Issues:**
+* **Images and Layout**: Confirmed by user that images are now working correctly on audiobook detail pages and the layout issue is resolved.
+* **"Märchen (Index aller Märchen) (LibriVox ID: 66)"**: This book still shows "No audio tracks found for sections". User clarified this is an index page on LibriVox that links to other audiobooks, not a directly playable audiobook. (Decision on how to handle this deferred).
+
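Review note on the image fix: the log says the `coverart_jpg` and `coverart_thumbnail` fields are prioritized when filling `cover_image`. One way to express that priority order is sketched below. `pickCoverImage` is a hypothetical name, and returning `null` when neither field is usable is an assumption about the desired fallback.

```php
<?php

declare(strict_types=1);

/**
 * Pick the first usable cover URL from an API record, in priority order.
 *
 * @param array<string, mixed> $book One decoded audiobook record.
 */
function pickCoverImage(array $book): ?string
{
    foreach (['coverart_jpg', 'coverart_thumbnail'] as $field) {
        $value = $book[$field] ?? null;

        // Skip missing, non-string, and blank values.
        if (is_string($value) && trim($value) !== '') {
            return trim($value);
        }
    }

    // Assumption: no placeholder image is substituted here.
    return null;
}

// The example URL is made up for illustration.
$record = ['coverart_jpg' => '', 'coverart_thumbnail' => 'https://example.org/thumb.jpg'];
echo pickCoverImage($record) ?? 'no cover', PHP_EOL;
```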
+**Next Steps (Requires User Action for Deployment & Full Automation):**
+1. **Deployment**: User confirmed migration to Laravel Forge and resolution of deployment issues. The next step is to deploy the current code changes to the live server.
+2. **Verify Live Site**: After successful deployment, confirm the homepage counters are correct and the "Hello World" test marker is visible.
+3. **Set up Cron Job in Forge**: Configure the `librivox:full-import` command as a scheduled task within Laravel Forge to automate the full import process.
Lines changed: 70 additions & 0 deletions
@@ -0,0 +1,70 @@
+<?php
+
+namespace App\Console\Commands;
+
+use App\Models\Audiobook;
+use Illuminate\Console\Command;
+use Illuminate\Support\Str;
+
+class BackfillAudiobookSlugs extends Command
+{
+    /**
+     * The name and signature of the console command.
+     *
+     * @var string
+     */
+    protected $signature = 'audiobooks:backfill-slugs';
+
+    /**
+     * The console command description.
+     *
+     * @var string
+     */
+    protected $description = 'Backfills missing slugs for existing audiobooks.';
+
+    /**
+     * Execute the console command.
+     *
+     * @return int
+     */
+    public function handle()
+    {
+        $this->info('Starting backfill of missing audiobook slugs...');
+
+        $audiobooksWithoutSlugs = Audiobook::whereNull('slug')->get();
+        $count = $audiobooksWithoutSlugs->count();
+
+        if ($count === 0) {
+            $this->info('No audiobooks found with missing slugs. Nothing to backfill.');
+            return Command::SUCCESS;
+        }
+
+        $this->info("Found {$count} audiobooks with missing slugs. Processing...");
+
+        $progressBar = $this->output->createProgressBar($count);
+        $progressBar->start();
+
+        $updatedCount = 0;
+
+        foreach ($audiobooksWithoutSlugs as $book) {
+            $baseSlug = Str::slug($book->title);
+            $slug = $baseSlug;
+            $counter = 1;
+
+            // Ensure uniqueness for the new slug
+            while (Audiobook::where('slug', $slug)->where('id', '!=', $book->id)->exists()) {
+                $slug = $baseSlug . '-' . $counter++;
+            }
+
+            $book->slug = $slug;
+            $book->save();
+            $updatedCount++;
+            $progressBar->advance();
+        }
+
+        $progressBar->finish();
+        $this->info("\nBackfill complete. {$updatedCount} slugs updated.");
+
+        return Command::SUCCESS;
+    }
+}
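Review note on the uniqueness loop: the command runs one `exists()` query per candidate slug, and a title that slugifies to an empty string would end up with an empty slug or one like `-1`. The suffixing logic can be sketched as a pure function over a set of already-taken slugs, which makes that edge case explicit and easy to test. `uniqueSlug` and the `'audiobook'` fallback stem are hypothetical and not part of this commit.

```php
<?php

declare(strict_types=1);

/**
 * Return $baseSlug, or the first "$baseSlug-N" that is not yet taken.
 *
 * @param array<string, true> $taken Slugs already in use, stored as array keys.
 */
function uniqueSlug(string $baseSlug, array $taken, string $fallback = 'audiobook'): string
{
    // An empty base would otherwise produce '' or "-1"; use a fixed stem instead.
    if ($baseSlug === '') {
        $baseSlug = $fallback;
    }

    $slug = $baseSlug;
    $counter = 1;

    // Same suffixing scheme as the command: base, base-1, base-2, ...
    while (isset($taken[$slug])) {
        $slug = $baseSlug . '-' . $counter++;
    }

    return $slug;
}

$taken = ['emma' => true, 'emma-1' => true];
echo uniqueSlug('emma', $taken), PHP_EOL;
```

With the taken slugs loaded once up front, the backfill would avoid repeated lookups, at the cost of holding the slug set in memory. Whether that trade-off is worthwhile depends on the table size.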
