mirror of https://github.com/jasonppy/VoiceCraft.git (synced 2025-06-05 21:49:11 +02:00)
add simple running instructions
README.md
@@ -13,6 +13,8 @@ There are three ways (besides running Gradio in Colab):

1. More flexible inference beyond Gradio UI in Google Colab. see [quickstart colab](#quickstart-colab)
2. with docker. see [quickstart docker](#quickstart-docker)
3. without docker. see [environment setup](#environment-setup). You can also run gradio locally if you choose this option
4. As a standalone script that you can easily integrate into other projects. see [quickstart command line](#quickstart-command-line).

When you are inside the docker image or you have installed all dependencies, check out [`inference_tts.ipynb`](./inference_tts.ipynb).
@@ -21,7 +23,7 @@ If you want to do model development such as training/finetuning, I recommend fol

## News

:star: 04/22/2024: 330M/830M TTS Enhanced Models are up [here](https://huggingface.co/pyp1), load them through [`gradio_app.py`](./gradio_app.py) or [`inference_tts.ipynb`](./inference_tts.ipynb)! Replicate demo is up, major thanks to [@chenxwh](https://github.com/chenxwh)!

:star: 04/11/2024: VoiceCraft Gradio is now available on HuggingFace Spaces [here](https://huggingface.co/spaces/pyp1/VoiceCraft_gradio)! Major thanks to [@zuev-stepan](https://github.com/zuev-stepan), [@Sewlell](https://github.com/Sewlell), [@pgosar](https://github.com/pgosar), and [@Ph0rk0z](https://github.com/Ph0rk0z).

:star: 04/05/2024: I finetuned giga330M with the TTS objective on gigaspeech and 1/5 of librilight. Weights are [here](https://huggingface.co/pyp1/VoiceCraft/tree/main). Make sure maximal prompt + generation length <= 16 seconds (due to our limited compute, we had to drop utterances longer than 16s in training data). Even stronger models forthcoming, stay tuned!
@@ -37,11 +39,9 @@ If you want to do model development such as training/finetuning, I recommend fol

- [x] Better guidance on training/finetuning
- [x] Colab notebooks
- [x] HuggingFace Spaces demo
- [x] Command line
- [ ] Improve efficiency

## QuickStart Colab

:star: To try out speech editing or TTS Inference with VoiceCraft, the simplest way is using Google Colab.
@@ -50,6 +50,15 @@ Instructions to run are on the Colab itself.

1. To try [Speech Editing](https://colab.research.google.com/drive/1FV7EC36dl8UioePY1xXijXTMl7X47kR_?usp=sharing)
2. To try [TTS Inference](https://colab.research.google.com/drive/1lch_6it5-JpXgAQlUTRRI2z2_rk5K67Z?usp=sharing)

## QuickStart Command Line

:star: To use it as a standalone script, check out tts_demo.py and speech_editing_demo.py. Be sure to first [set up your environment](#environment-setup). Without arguments, they run with the standard demo arguments used as examples elsewhere in this repository. You can use command line arguments to specify your own input audio, target transcript, and inference hyperparameters. Run the help command for more information: `python3 tts_demo.py -h` and `python3 speech_editing_demo.py -h`
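A minimal usage sketch based on the description above (both scripts expose `-h`; the specific flag names for audio paths, transcripts, and hyperparameters are not listed here, so discover them through the help output):

```bash
# Run the TTS demo with the standard built-in demo arguments
python3 tts_demo.py

# Inspect the available command line options (input audio, target
# transcript, inference hyperparameters) before customizing a run
python3 tts_demo.py -h
python3 speech_editing_demo.py -h
```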
## QuickStart Docker

:star: To try out TTS inference with VoiceCraft, you can also use docker. Thanks to [@ubergarm](https://github.com/ubergarm) and [@jayc88](https://github.com/jay-c88) for making this happen.
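As an illustration only, a hypothetical build-and-run sketch; the image tag `voicecraft` and the assumption that a Dockerfile sits at the repository root are mine, not the repo's documented commands, so follow the Docker instructions in this README for the exact workflow:

```bash
# Hypothetical sketch: build an image from the repo's Dockerfile (assumed to
# be at the repository root) and start an interactive GPU-enabled container.
docker build -t voicecraft .
docker run -it --gpus all voicecraft
```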
@@ -197,7 +206,7 @@ cd ./z_scripts

bash e830M.sh
```

It's the same procedure to prepare your own custom dataset. Make sure that if

## Finetuning

You also need to do steps 1-4 as in Training, and I recommend using AdamW for optimization when finetuning a pretrained model, for better stability. Check out the script `./z_scripts/e830M_ft.sh`.
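A short sketch of launching the finetuning script referenced above, assuming the same `./z_scripts` working-directory convention as the training example (the actual optimizer and hyperparameter settings live inside the script itself):

```bash
# Finetune from a pretrained checkpoint; per the note above, the script is
# set up to use AdamW for better stability when starting from pretrained weights.
cd ./z_scripts
bash e830M_ft.sh
```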