const axios = require('axios');
const FormData = require('form-data');

const api_key = "YOUR API-KEY";
const url = "https://api.segmind.com/v1/minimax-music-01";

// Request parameters; song_file is the reference track that defines the style.
const reqBody = {
  "lyrics": "[verse] \n In the silence, I hear your name \n Echoes of love that still remain \n Fading lights and midnight rain \n I’m lost in yesterday again \n [chorus] \n But I still dream, I still try \n To hold your ghost beneath the sky ",
  "bitrate": 256000,
  "song_file": "https://replicate.delivery/pbxt/M9zum1Y6qujy02jeigHTJzn0lBTQOemB7OkH5XmmPSC5OUoO/MiniMax-Electronic.wav",
  "sample_rate": 44100
};

(async function () {
  try {
    // Send every parameter as a multipart form field.
    const formData = new FormData();
    for (const [key, value] of Object.entries(reqBody)) {
      formData.append(key, value);
    }

    const response = await axios.post(url, formData, {
      headers: {
        'x-api-key': api_key,
        ...formData.getHeaders() // sets the multipart Content-Type with its boundary
      }
    });
    console.log(response.data);
  } catch (error) {
    // Prefer the API's error payload when the server returned one.
    console.error('Error:', error.response ? error.response.data : error.message);
  }
})();
Format your lyrics using newlines to separate each line. Use two newlines to add a pause. Wrap the lyrics with ## to include accompaniment. Max 400 characters.
An enumeration.
Allowed values:
Previously used voice ID
Reference song. Should include both music and vocals. Supported formats: .wav or .mp3. Minimum length: 15 seconds.
Reference file for voice; Must be a .wav or .mp3 file longer than 15 seconds. If only a voice reference is given, an a cappella vocal hum will be generated.
Sample rate for the output file.
Allowed values:
Previously used instrumental ID
Instrumental reference. Must be a .wav or .mp3 file longer than 15 seconds. If only an instrumental reference is given, a track without vocals will be generated.
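The lyrics formatting rules and the reference file format can be checked on the client before a request is sent. The following is a minimal sketch, not part of the Segmind API: formatLyrics and hasSupportedExtension are hypothetical helper names. It assumes that "##" is placed directly at the start and end of the lyrics, and that the 400-character limit counts the whole string including those markers. It checks only the file extension, not the 15-second minimum length.

```javascript
// Hypothetical client-side helpers; names and behavior are illustrative assumptions.
const MAX_LYRICS_LENGTH = 400; // limit stated in the lyrics parameter description

// sections: array of sections, each an array of lyric lines.
// Lines are joined with "\n"; sections are joined with "\n\n" (a pause).
// With accompaniment enabled, the result is wrapped in "##" (assumed placement).
function formatLyrics(sections, { accompaniment = false } = {}) {
  const body = sections.map((lines) => lines.join('\n')).join('\n\n');
  const lyrics = accompaniment ? `##${body}##` : body;
  if (lyrics.length > MAX_LYRICS_LENGTH) {
    throw new RangeError(
      `Lyrics are ${lyrics.length} characters; the limit is ${MAX_LYRICS_LENGTH}.`
    );
  }
  return lyrics;
}

// Returns true when a path or URL ends in .wav or .mp3 (case-insensitive),
// ignoring any query string or fragment.
function hasSupportedExtension(pathOrUrl) {
  const withoutQuery = pathOrUrl.split(/[?#]/)[0];
  return /\.(wav|mp3)$/i.test(withoutQuery);
}

const lyrics = formatLyrics(
  [
    ['In the silence, I hear your name', 'Echoes of love that still remain'],
    ['But I still dream, I still try'],
  ],
  { accompaniment: true }
);
console.log(JSON.stringify(lyrics));
console.log(hasSupportedExtension('https://example.com/reference.wav'));
```

The returned string can be used as the lyrics value in the request body. Throwing on over-long lyrics is one possible design choice; truncating or splitting the text would be alternatives.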
To track your credit usage, inspect the response headers of each API call. The x-remaining-credits header reports the number of credits remaining in your account. Monitor this value to avoid interruptions in your API usage.
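As a rough sketch of how that header might be read in Node.js: the helper below takes a headers object, such as response.headers from an axios call, and returns the remaining credits as a number. The name remainingCredits is hypothetical, and the code assumes the header value is a plain numeric string; the source does not specify its exact format.

```javascript
// Hypothetical helper: extract the x-remaining-credits value from response headers.
// Assumes lowercase header keys (as axios exposes them in Node.js) and a numeric value.
function remainingCredits(headers) {
  const raw = headers['x-remaining-credits'];
  if (raw == null || String(raw).trim() === '') {
    return null; // header missing or empty
  }
  const value = Number(raw);
  return Number.isNaN(value) ? null : value;
}

// Example with a plain object standing in for response.headers.
const credits = remainingCredits({ 'x-remaining-credits': '120' });
if (credits !== null) {
  console.log(`Remaining credits: ${credits}`);
}
```

With the request example above, the call would be remainingCredits(response.headers) after axios.post resolves.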
MiniMax Music-01 is a cutting-edge AI music generation model designed to simplify and accelerate the music creation process. It enables the simultaneous generation of accompaniment and vocals, making it a powerful tool for musicians, composers, content creators, and developers working in creative fields. By analyzing reference tracks using deep learning, Music-01 can replicate musical styles, rhythms, and vocal characteristics to produce entirely new pieces of music from user-provided lyrics.
One of the standout features of Music-01 is its ability to handle a wide range of musical genres—from classical and pop to rock, electronic, and beyond. Users can upload a reference track that defines the desired musical style, input their own lyrics, and the model will synthesize a new music piece that reflects those stylistic attributes. This capability makes the model incredibly versatile and accessible to both professionals and hobbyists.
Additionally, the model supports lyrics-to-music generation, allowing users to skip the complexities of composition, recording, and arranging. Music-01 also accommodates emotional nuances, capturing vocal tone and style from the reference input to enhance authenticity in the final output.
Music-01 is well-suited for a wide range of applications:
While the model offers remarkable capabilities, it currently has a few limitations. The maximum output length is restricted to 60 seconds, which may not be suitable for full-length compositions just yet. However, an upcoming release promises to extend this limit to 3 minutes. Another limitation is the requirement for a reference track, which is essential for the model to learn and reproduce a specific style.
MiniMax Music-01 is a powerful tool for AI-assisted music creation. Its three-step workflow (reference upload, lyrics input, music generation) makes it accessible to creators across experience levels. While it is still evolving in terms of output duration and independence from references, it is already a compelling option for fast, multi-style music generation.