Condor Service FAQ

Answers to your most common questions about Condor Service.

Quick, simple, and helpful information at a glance.

What is Condor?
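Condor, now distributed as HTCondor, is a workload management system for compute-intensive jobs. It queues jobs, matches them to available machines, runs them, and reports the results.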
How do I install Condor?
Installation instructions can be found on the Condor website: https://research.cs.wisc.edu/htcondor/.
Why is Condor not starting up?
Common causes are errors in the configuration files and permission problems on the directories Condor writes to. Run "condor_config_val LOG" to find the log directory, check the log files there (starting with MasterLog) for error messages, and refer to the Condor manual for troubleshooting steps.
How can I submit a job to Condor?
Describe the job in a submit description file and pass that file to the "condor_submit" command. The settings you need depend on your use case and Condor configuration. Refer to the Condor manual for more details.
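As a minimal sketch of what a submission can look like, the Python below builds the text of a small submit description file. The program /bin/echo, its argument, and the output file names are placeholders chosen for illustration, and the helper render_submit is hypothetical, not part of Condor.

```python
def render_submit(executable, arguments, queue_count=1):
    """Return the text of a minimal submit description file."""
    lines = [
        f"executable = {executable}",
        f"arguments  = {arguments}",
        # $(Cluster) and $(Process) are filled in by Condor for each job.
        "output     = job.$(Cluster).$(Process).out",
        "error      = job.$(Cluster).$(Process).err",
        "log        = job.$(Cluster).log",
        # The queue statement says how many jobs to create.
        f"queue {queue_count}",
    ]
    return "\n".join(lines) + "\n"


# Placeholder program and argument, for illustration only.
submit_text = render_submit("/bin/echo", "hello", queue_count=1)
print(submit_text)
```

Saved to a file such as job.sub (an assumed name), this text could then be submitted with "condor_submit job.sub".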
Why is my Condor job getting held?
Condor puts a job on hold when it hits a problem that will not fix itself, such as a missing input file. Run "condor_q -hold" to see the hold reason. After fixing the problem, return the job to the queue with "condor_release". Refer to the Condor manual for troubleshooting steps.
How do I change the Condor configuration?
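The Condor configuration is stored in a file called "condor_config", which can be edited to make changes. Use "condor_config_val" to check the current value of a setting and "condor_reconfig" to make the running daemons reread the configuration. Refer to the Condor manual for guidance on specific configuration settings.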
Why am I getting a "file not found" error when running Condor?
This could be due to incorrect file paths in your Condor job submission or configuration. Double check all file paths and refer to the Condor manual for more details.
What is a Condor pool?
A Condor pool is a collection of machines, local or remote, that report to a single central manager, which matches queued jobs to available machines.
How do I add more resources to my Condor pool?
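In general, you install Condor on the new machine and configure it to report to your pool's central manager (the CONDOR_HOST setting). The exact steps depend on your Condor configuration and security settings. Refer to the Condor manual for guidance on how to add new resources to your pool.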
What is a Condor checkpoint?
A Condor checkpoint is a snapshot of a job's current state, which allows the job to resume from that point after an interruption instead of starting over.
How do I troubleshoot Condor checkpoints?
Checkpoints can fail due to a variety of reasons such as insufficient resources or network issues. Refer to the Condor manual for troubleshooting steps.
Why is my Condor job failing with an "out of memory" error?
The job is probably using more memory than was requested for it. Set the memory the job needs with "request_memory" in the submit description file, and refer to the Condor manual for guidance on adjusting resource settings.
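As a rough sketch, the Python below adds resource requests to an existing submit description, keeping the "queue" statement last. The helper add_resource_requests, the script name analyze.sh, and the value 4096 are assumptions made for illustration. A bare number for "request_memory" is read as MiB.

```python
def add_resource_requests(submit_text, memory_mb, cpus=1):
    """Insert request_memory and request_cpus before the queue statement."""
    lines = submit_text.splitlines()
    # Find the first queue statement; if none exists, append at the end.
    index = next(
        (i for i, line in enumerate(lines)
         if line.strip().lower().startswith("queue")),
        len(lines),
    )
    lines[index:index] = [
        f"request_memory = {memory_mb}",  # bare number is taken as MiB
        f"request_cpus = {cpus}",
    ]
    return "\n".join(lines) + "\n"


# Placeholder submit description, for illustration only.
original = "executable = analyze.sh\nqueue\n"
updated = add_resource_requests(original, memory_mb=4096)
print(updated)
```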
How can I monitor the status of my Condor jobs?
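Use "condor_q" to list the jobs in the queue and their states, and "condor_history" to look up jobs that have already left the queue. "condor_status" reports on the machines in the pool rather than on jobs, which helps when checking whether resources are available. Refer to the Condor manual for more details.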
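For scripted monitoring, "condor_q -json" prints the queue as a JSON list of job ads. The sketch below counts jobs per state from such text. The sample data is made up for illustration, and the helper summarize is hypothetical. The numeric JobStatus codes are the ones Condor uses in job ads.

```python
import json
from collections import Counter

# Numeric JobStatus codes found in Condor job ads.
JOB_STATUS = {1: "Idle", 2: "Running", 3: "Removed", 4: "Completed", 5: "Held"}


def summarize(condor_q_json):
    """Count jobs per state from the text printed by condor_q -json."""
    text = condor_q_json.strip()
    # Treat empty output as an empty queue.
    jobs = json.loads(text) if text else []
    counts = Counter()
    for ad in jobs:
        counts[JOB_STATUS.get(ad.get("JobStatus"), "Other")] += 1
    return dict(counts)


# Made-up sample in the general shape of condor_q -json output.
sample = (
    '[{"ClusterId": 12, "ProcId": 0, "JobStatus": 2},'
    ' {"ClusterId": 12, "ProcId": 1, "JobStatus": 1},'
    ' {"ClusterId": 13, "ProcId": 0, "JobStatus": 5}]'
)
print(summarize(sample))
```

In practice the input text could come from running "condor_q -json" through Python's subprocess module.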
Why am I getting authentication errors in Condor?
This could be due to incorrect security settings in Condor or issues with the user's credentials. Refer to the Condor manual for troubleshooting steps.
How do I remove a job from the Condor queue?
Use "condor_rm" followed by a job ID to remove a specific job, or "condor_rm -all" to remove all jobs from your queue.
Why is my Condor job stuck in the idle state?
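An idle job is waiting to be matched to a machine. Typical causes are resource requests or requirements that no machine in the pool satisfies, a busy pool, or low user priority. Run "condor_q -better-analyze" with the job ID to see why the job is not matching, and refer to the Condor manual for troubleshooting steps.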
How do I transfer files between my local machine and a remote Condor job?
Condor has built-in file transfer capabilities, which can be configured in the job submission file. Refer to the Condor manual for more details.
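As an illustration, the Python below produces the file transfer lines that could go into a submit description file. The file names are placeholders and the helper transfer_settings is hypothetical. The submit commands themselves ("should_transfer_files", "when_to_transfer_output", "transfer_input_files", "transfer_output_files") are standard Condor submit commands.

```python
def transfer_settings(input_files, output_files=None):
    """Return submit description lines that enable Condor file transfer."""
    lines = [
        "should_transfer_files = YES",
        # Copy output back when the job exits.
        "when_to_transfer_output = ON_EXIT",
    ]
    if input_files:
        lines.append("transfer_input_files = " + ", ".join(input_files))
    if output_files:
        lines.append("transfer_output_files = " + ", ".join(output_files))
    return lines


# Placeholder file names, for illustration only.
settings = transfer_settings(["data.csv", "params.json"], ["result.txt"])
for line in settings:
    print(line)
```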
Why is my Condor job failing with a "permission denied" error?
This could be due to incorrect file permissions or security settings in Condor. Refer to the Condor manual or consult your system administrator for assistance.
How do I troubleshoot Condor networking issues?
Networking issues can cause jobs to fail or become stuck in the queue. Refer to the Condor manual for troubleshooting steps and consult your network administrator if needed.
What is a Condor job group?
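A Condor job group is a set of jobs that are submitted together from one submit description file and share the settings in that file. Condor itself calls such a set a cluster, and every job in it carries the same cluster ID.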
How do I submit a job group in Condor?
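Submit the jobs from a single submit description file with the "condor_submit" command. The number of jobs is set by a "queue" statement in the file (for example "queue 10"), or by the "-queue" option on the command line when the file has no "queue" statement. Refer to the Condor manual for more details.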
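Jobs created by one "queue" statement are usually told apart with the $(Process) macro, which takes the values 0, 1, 2 and so on. The sketch below imitates that substitution in plain Python to show the idea. It is a simplified stand-in, not Condor's own macro expansion, and the file name template is a placeholder.

```python
def expand_process_macro(template, count):
    """Mimic $(Process) taking the values 0..count-1 for 'queue count'."""
    return [template.replace("$(Process)", str(i)) for i in range(count)]


# Placeholder template: each job would read its own input file.
names = expand_process_macro("input_$(Process).dat", 3)
print(names)
```

With this pattern, one submit description file can drive many jobs that each read a different input file.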
Can I prioritize certain jobs in my Condor pool?
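Yes. To order your own jobs, set "priority" in the submit description file or change it later with "condor_prio". Priority between different users is handled separately and can be viewed with "condor_userprio". Refer to the Condor manual for more details.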
Why is my Condor job getting preempted?
Preemption can occur when a job from a user with better priority needs the machine, or when the machine's owner or policy reclaims it. Refer to the Condor manual for more details on priorities and preemption policy.
How do I scale up my Condor pool for larger jobs?
You can add more resources to your Condor pool or use the job group feature to split a single job into smaller parts. Refer to the Condor manual for more details.
Why am I getting a "job aborted" error in Condor?
This could be due to a variety of reasons such as invalid configuration settings, unavailable resources, or issues with the Condor installation. Refer to the Condor manual for troubleshooting steps.
How do I remove a resource from my Condor pool?
"condor_rm" removes jobs, not machines. To take a machine out of the pool, shut down the Condor daemons on that machine, for example with the "condor_off" command. Refer to the Condor manual for more details.
Why is Condor not running any jobs?
This could be due to a variety of reasons such as incorrect configuration settings, resource availability, or network issues. Check the Condor log files and refer to the Condor manual for troubleshooting steps.
How do I set up Condor to automatically restart failed jobs?
This can be configured in the submit description file with the "max_retries" command, which reruns a job that exits with a non-success exit code, up to the given number of times. Refer to the Condor manual for more details.
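To show how a retry limit behaves, the Python below is a simplified model, not Condor's actual implementation. It assumes a job is rerun after each non-success exit until it succeeds or has been retried max_retries times. The function runs_until_done and the exit code sequences are made up for illustration.

```python
def runs_until_done(exit_codes, max_retries, success_exit_code=0):
    """Return how many times a job runs under this simplified retry model.

    exit_codes lists the exit code of each successive run.
    """
    runs = 0
    for code in exit_codes:
        runs += 1
        if code == success_exit_code:
            break  # success: the job leaves the queue
        if runs > max_retries:
            break  # first run plus max_retries retries are used up
    return runs


# Made-up exit code sequences, for illustration only.
print(runs_until_done([1, 1, 0], max_retries=3))
print(runs_until_done([1, 1, 1, 1, 1], max_retries=2))
```

In this model a job with "max_retries = 2" runs at most three times: the first attempt plus two retries.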