Compare commits

7 Commits

Author SHA1 Message Date
fdb7cf7a4a Some human touches (make it clearer this is vibe coded, better env vars, etc.)
All checks were successful
Cargo Build & Test / Rust project - latest (1.90) (push) Successful in 1m51s
2025-09-22 20:40:17 -04:00
3497cbaa6e Upload example config toml 2025-09-22 20:33:36 -04:00
129d67bc8b Don't use the same API key for other organizations we are pulling from
```EOF
For the organizations list, I am trying to use my test instance, but I am getting the following in the logs:
```
2025-09-23T00:12:38.638052Z  INFO gitea_mirror: Fetching repositories from organization: https://gitea.hak8or.com/mirrors
2025-09-23T00:12:38.638081Z  INFO fetch_org_repos{org_url="https://gitea.hak8or.com/mirrors"}: gitea_mirror: Querying API endpoint: https://gitea.hak8or.com/api/v1/users/mirrors/repos
2025-09-23T00:12:38.653694Z ERROR gitea_mirror: Failed to fetch repos from https://gitea.hak8or.com/mirrors: HTTP status client error (401 Unauthorized) for url (https://gitea.hak8or.com/api/v1/users/mirrors/repos?page=1)
2025-09-23T00:12:38.653713Z  INFO gitea_mirror: Gitea mirror process completed.
```

I don't have a user with that key for the instance. Can you add the ability to provide an api key to each organization entry in the toml config? At the same time, is it possible to get a list of all repos from an organization without needing to use an api key? If so, when no api key is provided, can you use that?
```EOF
2025-09-22 20:29:14 -04:00
f732535db2 Re-create this with the canvas option in Gemini 2.5 Pro web chat
```EOF
Create a very minimal and simple tool written in Rust which takes in a list of git URLs and, using the Gitea API, checks if each remote is already mirrored, and if not, creates a repo migration to Gitea. I basically want a script which can be used to ensure a list of git repos is mirrored to a Gitea server.

The script should take the following command line arguments:
  - an option to do a dry run, meaning check whether the repo has to be mirrored, but do not initiate the actual migration
  - path to a TOML configuration file (can also be supplied via an ENV variable)

The configuration file would have the following information:
  - an API key to be used when talking to the Gitea instance we are migrating to
  - the URL of the above Gitea instance
  - a list of git URLs, each with an optional rename of the repo
  - a list of URLs on another git server (Gitea, or GitHub, GitLab, etc. if the API is the same), each including an organization name or username. You would clone all repos under that organization/username. For example "https://github.com/hak8or" would be all repos owned by hak8or.

Example TOML file:
```
gitea_url = "https://gitmirror.hak8or.com"

api_key = "api_key_goes_here"

repos = [
	{ url = "https://gitea.hak8or.com/hak8or/gitea_mirror.git" },
	{ rename = "cool_rename", url = "https://gitea.hak8or.com/hak8or/gitea_mirror.git" },
	{ rename = "cool_another_rename", url = "https://gitea.hak8or.com/hak8or/gitea_mirror.git" },
	{ rename = "rusty_rust", url = "https://github.com/rust-lang/rust.git" },
]
```

Ensure the script is as minimal as possible, do not use libraries if you can avoid them (except clap for CLI arguments, tracing for logging, actix for async and web interactions, reqwest for actual queries, and serde_json for json, or whatever else is commonly used in rust). I will be invoking this tool with a systemd timer.
```EOF
2025-09-22 20:28:35 -04:00
121387dbd2 Making mirrors be public by default (the heck Gemini ...)
All checks were successful
Cargo Build & Test / Rust project - latest (1.90) (push) Successful in 4m14s
2025-09-21 20:07:53 -04:00
0292577ff8 Remove stupid icons ... (friggen LLMs pulling in web dev habits ...) 2025-09-21 20:07:33 -04:00
9e63a0e3a8 Allow a rename for the repo name when ingesting remote URLs
Some checks failed
Cargo Build & Test / Rust project - latest (1.90) (push) Failing after 13m35s
2025-09-21 19:53:24 -04:00
4 changed files with 301 additions and 150 deletions

12
Cargo.lock generated

@@ -1083,9 +1083,9 @@ dependencies = [
[[package]]
name = "serde"
version = "1.0.225"
version = "1.0.226"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "fd6c24dee235d0da097043389623fb913daddf92c76e9f5a1db88607a0bcbd1d"
checksum = "0dca6411025b24b60bfa7ec1fe1f8e710ac09782dca409ee8237ba74b51295fd"
dependencies = [
"serde_core",
"serde_derive",
@@ -1093,18 +1093,18 @@ dependencies = [
[[package]]
name = "serde_core"
version = "1.0.225"
version = "1.0.226"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "659356f9a0cb1e529b24c01e43ad2bdf520ec4ceaf83047b83ddcc2251f96383"
checksum = "ba2ba63999edb9dac981fb34b3e5c0d111a69b0924e253ed29d83f7c99e966a4"
dependencies = [
"serde_derive",
]
[[package]]
name = "serde_derive"
version = "1.0.225"
version = "1.0.226"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0ea936adf78b1f766949a4977b91d2f5595825bd6ec079aa9543ad2685fc4516"
checksum = "8db53ae22f34573731bafa1db20f04027b2d25e02d8205921b569171699cdb33"
dependencies = [
"proc-macro2",
"quote",

View File

@@ -4,11 +4,11 @@ version = "0.1.0"
edition = "2024"
[dependencies]
clap = { version = "4.5", features = ["derive", "env"] }
reqwest = { version = "0.12.23", features = ["json"] }
clap = { version = "4.0", features = ["derive", "env"] }
tokio = { version = "1", features = ["full"] }
reqwest = { version = "0.12", features = ["json"] }
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"
tokio = { version = "1.35", features = ["full"] }
toml = "0.9.7"
toml = "0.9"
tracing = "0.1"
tracing-subscriber = "0.3"
tracing-subscriber = "0.3"

17
example.toml Normal file

@@ -0,0 +1,17 @@
# The base URL of your Gitea instance
gitea_url = "https://gitmirror.hak8or.com"
# Your Gitea API key (generate one from User Settings -> Applications)
api_key = "API_KEY_GOES_HERE"
# A list of remote git repositories to mirror.
repos = [
{ url = "https://gitea.hak8or.com/hak8or/gitea_mirror.git" },
{ rename = "cool_rename", url = "https://gitea.hak8or.com/hak8or/gitea_mirror.git" },
{ rename = "cool_another_rename", url = "https://gitea.hak8or.com/hak8or/gitea_mirror.git" },
{ url = "https://github.com/justcallmekoko/ESP32Marauder" }
]
organizations = [
{ url = "https://gitea.hak8or.com/mirrors" },
]

View File

@@ -1,139 +1,193 @@
use clap::Parser;
use reqwest::header::{AUTHORIZATION, HeaderMap, HeaderValue};
use serde::{Deserialize, Serialize};
use std::path::PathBuf;
use tracing::{debug, error, info, warn};
use serde::Deserialize;
use std::fs;
use std::path::{Path, PathBuf};
use tracing::{Level, error, info, instrument, warn};
use tracing_subscriber;
// Represents the command-line arguments.
#[derive(Parser, Debug)]
#[command(name = "gitea-mirror")]
#[command(about = "Ensures Git repositories are mirrored to Gitea, generated with Claude Opus 4.1")]
#[command(
about = "Ensures Git repositories are mirrored to Gitea, generated with Gemini 2.5 Web Canvas"
)]
#[clap(author, version, about, long_about = None)]
struct Args {
/// Path to TOML configuration file
#[arg(short, long, env = "GITEA_MIRROR_CONFIG_FILEPATH")]
/// Path to the TOML configuration file.
#[clap(short, long, value_parser, env = "GITEA_MIRROR_CONFIG_FILEPATH")]
config: PathBuf,
/// Dry run - check but don't create migrations
#[arg(short, long, default_value_t = false)]
/// Perform a dry run without creating any migrations.
#[clap(short, long, default_value_t = false)]
dry_run: bool,
}
// Represents a single repository entry in the config file.
#[derive(Deserialize, Debug, Clone)]
struct RepoConfig {
url: String,
rename: Option<String>,
}
// Represents a single organization entry in the config file.
#[derive(Deserialize, Debug, Clone)]
struct OrgConfig {
url: String,
api_key: Option<String>,
}
// Represents the main structure of the TOML configuration file.
#[derive(Deserialize, Debug)]
struct Config {
gitea_url: String,
api_key: String,
git_urls: Vec<String>,
repos: Option<Vec<RepoConfig>>,
organizations: Option<Vec<OrgConfig>>,
}
#[derive(Deserialize, Debug)]
struct Repository {
name: String,
mirror: bool,
original_url: Option<String>,
}
#[derive(Serialize)]
struct MigrateRepoRequest {
clone_addr: String,
repo_name: String,
// Represents the payload for creating a migration in Gitea.
#[derive(serde::Serialize, Debug)]
struct MigrateRepoPayload<'a> {
clone_addr: &'a str,
repo_name: &'a str,
mirror: bool,
private: bool,
description: String,
description: &'a str,
uid: i64, // The user ID of the owner. We'll fetch this.
}
// Represents a user as returned by the Gitea API.
#[derive(Deserialize, Debug)]
struct GiteaUser {
id: i64,
}
/// Entry point of the application.
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
tracing_subscriber::fmt::init();
// Initialize the tracing subscriber for logging.
tracing_subscriber::fmt().with_max_level(Level::INFO).init();
// Parse command-line arguments or get config path from environment variable.
let args = Args::parse();
// Load configuration
let config_content = std::fs::read_to_string(&args.config)?;
let config: Config = toml::from_str(&config_content)?;
info!("Starting Gitea mirror process. Dry run: {}", args.dry_run);
info!("Starting Gitea mirror sync");
info!("Dry run: {}", args.dry_run);
info!("Gitea URL: {}", config.gitea_url);
info!("Checking {} repositories", config.git_urls.len());
// Read and parse the configuration file.
let config = load_config(&args.config)?;
let http_client = reqwest::Client::new();
// Create HTTP client with auth header
let mut headers = HeaderMap::new();
headers.insert(
AUTHORIZATION,
HeaderValue::from_str(&format!("token {}", config.api_key))?,
// Fetch the Gitea user ID for the authenticated user.
let user_id = get_gitea_user_id(&http_client, &config.gitea_url, &config.api_key).await?;
info!(
"Successfully authenticated and retrieved user ID: {}",
user_id
);
let client = reqwest::Client::builder()
.default_headers(headers)
.build()?;
// Process each Git URL
for git_url in &config.git_urls {
info!("Processing: {}", git_url);
// Process repositories from the static list.
if let Some(repos) = &config.repos {
for repo_config in repos {
process_repo(
&repo_config.url,
repo_config.rename.as_deref(),
user_id,
&http_client,
&config,
args.dry_run,
)
.await?;
}
}
let repo_name = extract_repo_name(git_url);
let is_mirrored =
check_if_mirrored(&client, &config.gitea_url, git_url, &repo_name).await?;
if is_mirrored {
info!("✓ Already mirrored: {}", repo_name);
} else {
warn!("✗ Not mirrored: {}", repo_name);
if !args.dry_run {
info!("Creating migration for: {}", repo_name);
create_migration(&client, &config.gitea_url, git_url, &repo_name).await?;
info!("✓ Migration created for: {}", repo_name);
} else {
info!("[DRY RUN] Would create migration for: {}", repo_name);
// Process repositories from the organizations/users list.
if let Some(org_configs) = &config.organizations {
for org_config in org_configs {
info!(
"Fetching repositories from organization: {}",
org_config.url
);
match fetch_org_repos(&http_client, &org_config.url, org_config.api_key.as_deref())
.await
{
Ok(repo_urls) => {
info!(
"Found {} repositories for {}",
repo_urls.len(),
org_config.url
);
for url in repo_urls {
process_repo(
&url,
None, // No rename support for orgs
user_id,
&http_client,
&config,
args.dry_run,
)
.await?;
}
}
Err(e) => error!("Failed to fetch repos from {}: {}", org_config.url, e),
}
}
}
info!("Gitea mirror sync complete");
info!("Gitea mirror process completed.");
Ok(())
}
fn extract_repo_name(git_url: &str) -> String {
let url = git_url.trim_end_matches(".git");
url.split('/').last().unwrap_or("unknown").to_string()
/// Loads and parses the TOML configuration file.
#[instrument(skip(path))]
fn load_config(path: &Path) -> Result<Config, Box<dyn std::error::Error>> {
info!("Loading configuration from: {:?}", path);
let content = fs::read_to_string(path)?;
let config: Config = toml::from_str(&content)?;
Ok(config)
}
async fn check_if_mirrored(
client: &reqwest::Client,
/// Fetches the authenticated user's ID from Gitea.
#[instrument(skip(http_client, gitea_url, api_key))]
async fn get_gitea_user_id(
http_client: &reqwest::Client,
gitea_url: &str,
git_url: &str,
repo_name: &str,
) -> Result<bool, Box<dyn std::error::Error>> {
// Search for repositories by name
let search_url = format!("{}/api/v1/repos/search", gitea_url);
let response = client
.get(&search_url)
.query(&[("q", repo_name), ("limit", "50")])
api_key: &str,
) -> Result<i64, reqwest::Error> {
let url = format!("{}/api/v1/user", gitea_url);
let user: GiteaUser = http_client
.get(&url)
.header("Authorization", format!("token {}", api_key))
.send()
.await?
.error_for_status()?
.json()
.await?;
Ok(user.id)
}
/// Checks if a repository already exists in Gitea for the user.
#[instrument(skip(http_client, gitea_url, api_key))]
async fn repo_exists(
http_client: &reqwest::Client,
gitea_url: &str,
api_key: &str,
repo_name: &str,
) -> Result<bool, reqwest::Error> {
let url = format!("{}/api/v1/repos/search", gitea_url);
let response: serde_json::Value = http_client
.get(&url)
.query(&[("q", repo_name), ("limit", "1")])
.header("Authorization", format!("token {}", api_key))
.send()
.await?
.error_for_status()?
.json()
.await?;
if !response.status().is_success() {
error!("Failed to search repos: {}", response.status());
return Ok(false);
}
let search_result: serde_json::Value = response.json().await?;
if let Some(data) = search_result.get("data").and_then(|d| d.as_array()) {
for repo_json in data {
if let Ok(repo) = serde_json::from_value::<Repository>(repo_json.clone()) {
debug!("Found repo: {} (mirror: {})", repo.name, repo.mirror);
// Check if this is a mirror and matches our URL
if repo.mirror {
if let Some(original) = &repo.original_url {
// Normalize URLs for comparison
let normalized_original = normalize_git_url(original);
let normalized_target = normalize_git_url(git_url);
if normalized_original == normalized_target {
return Ok(true);
}
}
if let Some(data) = response.get("data").and_then(|d| d.as_array()) {
for repo in data {
if let Some(name) = repo.get("name").and_then(|n| n.as_str()) {
if name.eq_ignore_ascii_case(repo_name) {
return Ok(true);
}
}
}
@@ -142,52 +196,132 @@ async fn check_if_mirrored(
Ok(false)
}
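The new `repo_exists` asks the search endpoint for a single hit (`limit = 1`) and compares only that hit's name. Assuming the search can return partial matches (for example `rusty_rust` for the query `rust`), an existing repo could be missed. A minimal sketch of the comparison step, written against plain name strings so it stays independent of the JSON handling; `has_exact_name` is a hypothetical helper, not part of this diff:

```rust
/// Hypothetical helper: returns true if any name in `names` equals `wanted`,
/// ignoring ASCII case (the same comparison the diff applies to one hit).
fn has_exact_name<'a, I>(names: I, wanted: &str) -> bool
where
    I: IntoIterator<Item = &'a str>,
{
    names.into_iter().any(|name| name.eq_ignore_ascii_case(wanted))
}
```

With a larger `limit`, every name in the `data` array could be fed through this check rather than only the first one.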
fn normalize_git_url(url: &str) -> String {
let mut normalized = url.to_lowercase();
// Remove trailing .git
if normalized.ends_with(".git") {
normalized = normalized[..normalized.len() - 4].to_string();
}
// Convert git@ to https://
if normalized.starts_with("git@") {
normalized = normalized.replace("git@", "https://").replace(":", "/");
}
// Remove protocol variations
normalized = normalized
.replace("https://", "")
.replace("http://", "")
.replace("git://", "");
normalized
}
/// Creates a mirror migration in Gitea.
#[instrument(skip(http_client, config, payload))]
async fn create_migration(
client: &reqwest::Client,
gitea_url: &str,
git_url: &str,
repo_name: &str,
) -> Result<(), Box<dyn std::error::Error>> {
let migrate_url = format!("{}/api/v1/repos/migrate", gitea_url);
let request = MigrateRepoRequest {
clone_addr: git_url.to_string(),
repo_name: repo_name.to_string(),
mirror: true,
private: false,
description: format!("Mirror of {}", git_url),
};
let response = client.post(&migrate_url).json(&request).send().await?;
if !response.status().is_success() {
let status = response.status();
let error_text = response.text().await?;
error!("Failed to create migration: {} - {}", status, error_text);
return Err(format!("Migration failed: {}", status).into());
}
http_client: &reqwest::Client,
config: &Config,
payload: &MigrateRepoPayload<'_>,
) -> Result<(), reqwest::Error> {
let url = format!("{}/api/v1/repos/migrate", config.gitea_url);
http_client
.post(&url)
.header("Authorization", format!("token {}", config.api_key))
.json(payload)
.send()
.await?
.error_for_status()?;
Ok(())
}
/// Fetches all repository clone URLs from a given Gitea/GitHub organization/user page.
#[instrument(skip(http_client, api_key))]
async fn fetch_org_repos(
http_client: &reqwest::Client,
org_url: &str,
api_key: Option<&str>,
) -> Result<Vec<String>, Box<dyn std::error::Error>> {
// This is a simplified fetcher. It assumes Gitea API compatibility.
// For GitHub, you might need a different base URL and auth method.
let api_url = if org_url.contains("github.com") {
let parts: Vec<&str> = org_url.trim_end_matches('/').split('/').collect();
let user_or_org = parts.last().ok_or("Invalid GitHub URL")?;
format!("https://api.github.com/users/{}/repos", user_or_org)
} else {
// Assuming Gitea-like URL structure
let parts: Vec<&str> = org_url.trim_end_matches('/').split('/').collect();
let user_or_org = parts.last().ok_or("Invalid Gitea URL")?;
format!(
"{}s/{}/repos",
org_url.replace(user_or_org, &format!("api/v1/user")),
user_or_org
)
};
info!("Querying API endpoint: {}", api_url);
let mut repos: Vec<String> = Vec::new();
let mut page = 1;
loop {
let mut request_builder = http_client
.get(&api_url)
.query(&[("page", page.to_string())])
// For GitHub, a User-Agent is required.
.header("User-Agent", "gitea-mirror-rust-client");
if let Some(key) = api_key {
request_builder = request_builder.header("Authorization", format!("token {}", key));
}
let response: Vec<serde_json::Value> = request_builder
.send()
.await?
.error_for_status()?
.json()
.await?;
if response.is_empty() {
break; // No more pages
}
for repo in response {
if let Some(clone_url) = repo.get("clone_url").and_then(|u| u.as_str()) {
repos.push(clone_url.to_string());
}
}
page += 1;
}
Ok(repos)
}
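The `replace`-based URL construction in `fetch_org_repos` substitutes every occurrence of the user or organization name, so a profile URL whose name also appears in the host (for example `https://gitea.hak8or.com/hak8or`) would likely produce a malformed endpoint. A minimal sketch of an alternative that splits the URL instead of replacing inside it. `repos_api_url` is a hypothetical helper name, and the sketch assumes profile URLs of the form `scheme://host[/prefix]/name`:

```rust
/// Hypothetical helper: derives the "list repositories" API endpoint from a
/// user/organization profile URL. Returns None if the URL has no scheme or
/// no trailing name segment.
fn repos_api_url(org_url: &str) -> Option<String> {
    // Ignore trailing slashes so ".../mirrors/" behaves like ".../mirrors".
    let trimmed = org_url.trim_end_matches('/');
    // "https://host/name" -> ("https", "host/name")
    let (scheme, rest) = trimmed.split_once("://")?;
    // "host/name" -> ("host", "name"); only the last segment is the name.
    let (base, name) = rest.rsplit_once('/')?;
    if base.is_empty() || name.is_empty() {
        return None;
    }
    if base == "github.com" {
        // Same GitHub endpoint the diff uses.
        Some(format!("https://api.github.com/users/{name}/repos"))
    } else {
        // Gitea-style endpoint, matching the URL seen in the logs above.
        Some(format!("{scheme}://{base}/api/v1/users/{name}/repos"))
    }
}
```

Because the host is never rewritten, a name that matches part of the domain no longer affects the result.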
/// Core logic to process a single repository.
#[instrument(skip(user_id, http_client, config, dry_run))]
async fn process_repo(
repo_url: &str,
rename: Option<&str>,
user_id: i64,
http_client: &reqwest::Client,
config: &Config,
dry_run: bool,
) -> Result<(), Box<dyn std::error::Error>> {
let repo_name = match rename {
Some(name) => name,
None => extract_repo_name(repo_url).ok_or("Could not extract repo name from URL")?,
};
info!("Processing repo '{}' -> '{}'", repo_url, repo_name);
if repo_exists(http_client, &config.gitea_url, &config.api_key, repo_name).await? {
info!("Repo '{}' already exists. Skipping.", repo_name);
} else {
warn!("Repo '{}' does not exist. Migration needed.", repo_name);
if !dry_run {
info!("Initiating migration for '{}'...", repo_name);
let payload = MigrateRepoPayload {
clone_addr: repo_url,
repo_name,
mirror: true,
private: false, // Defaulting to public, change if needed
description: "",
uid: user_id,
};
if let Err(e) = create_migration(http_client, config, &payload).await {
error!("Failed to create migration for '{}': {}", repo_name, e);
} else {
info!("Successfully started migration for '{}'.", repo_name);
}
} else {
info!(
"Dry run enabled. Skipping actual migration for '{}'.",
repo_name
);
}
}
Ok(())
}
/// Extracts a repository name from a git URL (e.g., "https://.../repo.git" -> "repo").
fn extract_repo_name(url: &str) -> Option<&str> {
url.split('/').last().map(|s| s.trim_end_matches(".git"))
}
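As written, `extract_repo_name` yields an empty name for a URL that ends in `/`, and `trim_end_matches(".git")` strips the suffix repeatedly rather than once. A minimal sketch of a stricter variant, under the assumption that an empty name should be treated as an error; `repo_name_from_url` is a hypothetical name used only for illustration:

```rust
/// Hypothetical variant of `extract_repo_name`: tolerates a trailing slash,
/// strips a single ".git" suffix, and rejects empty names.
fn repo_name_from_url(url: &str) -> Option<&str> {
    let trimmed = url.trim_end_matches('/');
    // Last path segment; for a string without '/' this is the whole string.
    let last = trimmed.rsplit('/').next()?;
    // Remove ".git" at most once.
    let name = last.strip_suffix(".git").unwrap_or(last);
    if name.is_empty() { None } else { Some(name) }
}
```

The `Option<&str>` return type matches the existing function, so the `ok_or(...)` call in `process_repo` would work unchanged with it.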