One way to do this is to trap the job failure in an on-failure step and have that step insert a row into an alert table. A second job, scheduled to run every n minutes, checks that table for new records; if it finds a record with flag = 0, it sends an email to the operator, and it keeps sending on each run until the flag changes. When the operator finally responds, have some mechanism update the flag in the alert table. You then have a history of alerts and response times.
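
Here is a rough sketch of that flow, just to make the moving parts concrete. It uses Python with SQLite purely for illustration; the table name job_alerts, its columns, and the function names are all made up for this example, and send_mail is a placeholder for whatever mail mechanism you actually have. In practice the same three pieces would be an on-failure step, a scheduled job, and an acknowledgement procedure in your own scheduler and database.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical alert table: flag = 0 means unacknowledged, 1 means acknowledged.
SCHEMA = """
CREATE TABLE IF NOT EXISTS job_alerts (
    id              INTEGER PRIMARY KEY AUTOINCREMENT,
    job_name        TEXT NOT NULL,
    failed_at       TEXT NOT NULL,
    flag            INTEGER NOT NULL DEFAULT 0,
    acknowledged_at TEXT
)
"""


def _now():
    """Current UTC time as an ISO 8601 string."""
    return datetime.now(timezone.utc).isoformat()


def init_alert_table(conn):
    """Create the alert table if it does not exist yet."""
    conn.execute(SCHEMA)
    conn.commit()


def record_failure(conn, job_name):
    """What the on-failure step would do: insert one unacknowledged alert."""
    cur = conn.execute(
        "INSERT INTO job_alerts (job_name, failed_at) VALUES (?, ?)",
        (job_name, _now()),
    )
    conn.commit()
    return cur.lastrowid


def notify_pending(conn, send_mail):
    """What the polling job would do every n minutes.

    Calls send_mail once per alert that still has flag = 0 and returns
    the number of alerts it mailed about.
    """
    rows = conn.execute(
        "SELECT id, job_name, failed_at FROM job_alerts "
        "WHERE flag = 0 ORDER BY id"
    ).fetchall()
    for alert_id, job_name, failed_at in rows:
        send_mail(f"Job {job_name} failed at {failed_at} (alert {alert_id})")
    return len(rows)


def acknowledge(conn, alert_id):
    """What the operator's response would trigger: set the flag and timestamp."""
    conn.execute(
        "UPDATE job_alerts SET flag = 1, acknowledged_at = ? "
        "WHERE id = ? AND flag = 0",
        (_now(), alert_id),
    )
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    init_alert_table(conn)
    alert_id = record_failure(conn, "nightly_load")
    notify_pending(conn, print)   # mails (here: prints) the open alert
    acknowledge(conn, alert_id)
    notify_pending(conn, print)   # nothing left to send
```

Because failed_at and acknowledged_at are both stored, the response time for each alert is just the difference between the two columns.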
Andrew