Well, I guess it couldn't hurt to break it down some:
Talent:
Some folks have it, and are naturally good voice actors with no training. There is a knack to making the voice sound interesting enough to catch the listener's attention while also conveying a sense of character and purpose, in addition to the information dictated by the script. There are any number of ways to deliver a line, and the entire meaning of the same line can change depending on which word(s) are stressed above the others. To be convincing, conveying an emotional element is key, and it is normally difficult to express an emotion you aren't actually feeling. The big five most difficult emotions are anger, love, joy, sorrow, and fear. It is very hard to convince others you are feeling any of them when you actually aren't. However, in my experience, most people are naturally good at pretending one of the five: some folks can cry on cue, others are totally believable when they laugh (even if it isn't funny), and my particular strength is having numerous ways to express anger when I'm not at all angry. It is not easy to be good at all of them, though. The more difficult a particular line is, the more recording "takes" are required to get it just right. Having a really good director on hand to give feedback on the recording session can help, or lacking that, someone who has decent judgement and can at least give honest "that works/doesn't work" feedback. It's not easy, and it requires patience, but if you enjoy doing it, you should do it.
Recording environment:
This factor is usually overlooked, and can be devastating combined with a low-end microphone, and maybe doubly so if you have a really good microphone, because it is likely to pick up everything: the sound of the refrigerator/air conditioner/TV/fan, the hum of, say, a fish tank motor, or even the hum of the computer itself. You want a nice quiet space, with minimal echo, windows closed, and any unnecessary electronics turned off. Living near an airport is the worst possible curse you could wish on a sound man. If you can afford to rent a professional recording studio, I highly recommend it, as they will have all the gear, and a technician usually comes with the rental fee.
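If you want a rough number for how quiet your space is, one option is to record a few seconds of "room tone" (nobody talking) and measure its level. The sketch below is just one way to do that in Python, assuming a 16-bit mono WAV file; the function names `read_wav_mono16` and `rms_dbfs` are made up for this example. The result is in dB relative to full scale, so more negative means quieter.

```python
import math
import struct
import wave


def read_wav_mono16(path):
    """Read a 16-bit mono WAV file and return samples scaled to [-1.0, 1.0)."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2 or wav.getnchannels() != 1:
            raise ValueError("this sketch only handles 16-bit mono WAV files")
        raw = wav.readframes(wav.getnframes())
    count = len(raw) // 2
    # "<h" = little-endian signed 16-bit, which is what WAV files use.
    return [s / 32768.0 for s in struct.unpack("<%dh" % count, raw)]


def rms_dbfs(samples):
    """Return the RMS level of the samples in dB relative to full scale."""
    if not samples:
        raise ValueError("no samples")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")  # pure digital silence
    return 10.0 * math.log10(mean_square)
```

Comparing the number with the fridge, fan, or computer on and off gives a quick sense of which noise source is costing you the most.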
Headphones:
The better the headphones, the better your ability to pick out background noise and other distortions, and to judge the overall quality of the recording.
The Microphone:
Of all the pieces of hardware, it is the most important. It doesn't matter how good the rest of your gear is; a bad mic is going to pass on bad sound. Good microphones for voice overs run $100 to $200 US. Professional studios and singers can easily spend several thousand dollars on a single microphone. USB microphones are pretty much only designed to capture the voice well enough that the other person can understand what you are saying, so don't count on them delivering a consistent, crisp, clean sound. Also, just because a microphone costs a lot of money doesn't mean it's actually any good. Do the research, and check out websites that target professional singers / musicians / voice over artists to see what their recommendations are.
Pop screens:
A pop screen is an attachment that sits between the microphone and the mouth, and helps soften pops from hard "P"s as well as hard "S"s and "T"s. Without it, you could get a spike in the recording that goes beyond what the microphone/hardware/software can handle, and you end up with crackle and distortion. You can create your own homemade pop screen with some coat hanger wire and a cut-up thin dress sock or pantyhose, or you can just put the sock/hose over the microphone itself.
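Those spikes usually show up in the recording as several samples in a row stuck at the maximum level ("clipping"). As a rough illustration, here is a small Python sketch that looks for such runs. It assumes the samples have already been scaled to the range -1.0 to 1.0; the function name, the 0.999 threshold, and the minimum run length of 3 are arbitrary choices for this example, not standard values.

```python
def find_clipped_runs(samples, threshold=0.999, min_run=3):
    """Return (start_index, length) for each run of consecutive samples
    whose absolute value is at or above the threshold."""
    runs = []
    start = None
    for i, s in enumerate(samples):
        if abs(s) >= threshold:
            if start is None:
                start = i  # a new run begins here
        elif start is not None:
            if i - start >= min_run:
                runs.append((start, i - start))
            start = None
    # Handle a run that lasts until the very end of the take.
    if start is not None and len(samples) - start >= min_run:
        runs.append((start, len(samples) - start))
    return runs
```

If a take comes back with any runs at all, it is probably quicker to re-record it a little further from the mic than to try repairing it afterwards.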
Hardware:
Most computers have high-def audio built into the motherboard, which is suitable enough for voice recordings. If you want to capture music/singing, it's worth investing in a higher-end audio card.
Recording Software:
Audacity is a fairly good piece of software for capturing the recording (especially considering it is free and open source). For more advanced recording options that allow you to use limiters and other features to control the initial quality of the recording, higher-end software such as Pro Tools or Sonar is helpful for getting a clean sound off the bat.
Post-recording Software:
For adding echo and pitch shifting, editing and splicing sound clips together (including blending and volume control), cleaning out pops and clicks, and eliminating cycle hum and background "white noise", the higher-end software is usually better suited to the task.
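To give an idea of what hum removal does under the hood, here is a minimal Python sketch of a notch filter, which cuts a narrow band around one frequency and leaves the rest mostly alone. It uses the widely published biquad notch formulas; this is an illustration, not what any particular program does. The function names are made up, and the hum frequency is something you have to supply (mains hum is 50 or 60 Hz depending on where you live).

```python
import math


def notch_coefficients(hum_hz, sample_rate, q=10.0):
    """Biquad notch coefficients, normalized so the leading a0 term is 1.
    A higher q gives a narrower notch."""
    w0 = 2.0 * math.pi * hum_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    cos_w0 = math.cos(w0)
    a0 = 1.0 + alpha
    b = (1.0 / a0, -2.0 * cos_w0 / a0, 1.0 / a0)
    a = (-2.0 * cos_w0 / a0, (1.0 - alpha) / a0)
    return b, a


def remove_hum(samples, hum_hz, sample_rate, q=10.0):
    """Run the samples through the notch filter and return the result."""
    (b0, b1, b2), (a1, a2) = notch_coefficients(hum_hz, sample_rate, q)
    x1 = x2 = y1 = y2 = 0.0  # previous two inputs and outputs
    out = []
    for x in samples:
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out
```

For example, `remove_hum(samples, 60.0, 44100)` would target 60 Hz hum in a 44.1 kHz recording. Real hum often has overtones at multiples of the base frequency, so a single notch like this may not catch all of it.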
It's definitely time consuming, and requires patience. Recording 100 words of dialogue and getting them "just right" could easily take a half hour to an hour, depending on how picky you are about the result. I am very self-conscious about my work, so it takes me forever, and I rarely do it unless I am passionate about it.
But then, that's what it comes down to. Regardless of what gear you have or your experience, if you are passionate about it, just do it!