URLs Detector - ApyHub
SharpAPI
1000 atoms
Base tier

About

The URLs Detector API identifies, extracts, and validates web links in large datasets, and also detects broken URLs. It streamlines content management, data validation, and compliance by filtering out invalid or unwanted URLs.
Ideal for developers in content moderation, data cleaning, and compliance, the API automates URL validation to ensure consistency and accuracy. Use cases include cleaning up datasets for valid URLs, identifying broken links in content repositories, and enforcing policies around unauthorized URLs.

AI jobs are long-running calls that are split into two steps:

  1. Submitting the job: send the job request to start processing.
  2. Monitoring and receiving results: check the job status until the job completes successfully, then retrieve the final output.
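The two-step flow can be sketched as a submit-then-poll loop. This is a minimal sketch under stated assumptions, not ApyHub's client: `submit` and `fetch_status` are hypothetical callables you would supply (for example, thin wrappers around your HTTP library), and the `status` field with the values `"success"` and `"failed"` is an assumption, because this page does not document the status response.

```python
import time
from typing import Any, Callable, Dict


def run_job(
    submit: Callable[[], Dict[str, Any]],
    fetch_status: Callable[[str], Dict[str, Any]],
    interval_s: float = 2.0,
    max_attempts: int = 30,
) -> Dict[str, Any]:
    """Submit a job, then poll its status URL until it reaches a final state."""
    # Step 1: submit the job; the response carries the URL to poll.
    job = submit()
    status_url = job["status_url"]

    # Step 2: poll until the job reports a final state or attempts run out.
    for _ in range(max_attempts):
        result = fetch_status(status_url)
        # ASSUMPTION: the status payload has a "status" field with these values.
        state = result.get("status")
        if state == "success":
            return result
        if state == "failed":
            raise RuntimeError(f"job {job.get('job_id')} failed")
        time.sleep(interval_s)
    raise TimeoutError(f"job {job.get('job_id')} did not finish in time")
```

Passing the two HTTP steps in as callables keeps the loop independent of any particular HTTP library and makes it easy to exercise with fakes.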

API Playground

API Documentation

URLs Detector Submit Job
POST
https://api.apyhub.com/sharpapi/api/v1/content/detect_urls

Request example

curl --location --request POST 'https://api.apyhub.com/sharpapi/api/v1/content/detect_urls' \
--header 'apy-token: {{token}}' \
--header 'Content-Type: application/json' \
--data-raw '{
"content": "HTTP (Hypertext Transfer Protocol) - Used for standard, unencrypted web browsing (e.g., http://example.com). HTTPS (Hypertext Transfer Protocol Secure) - Encrypted version of HTTP, providing secure communication (e.g., https://example.com).FTP (File Transfer Protocol) - Used for file transfers between computers on a network (e.g., ftp://example.com).SFTP (SSH File Transfer Protocol) - A secure version of FTP that runs over SSH (e.g., sftp://example.com).FTPS (FTP Secure) - FTP with SSL/TLS encryption for secure file transfers (e.g., ftps://example.com).MAILTO - Opens a default email client to send an email (e.g., mailto:someone@example.com).TEL - Used for initiating phone calls on devices with call capability (e.g., tel:+1234567890).DATA - Embeds small files inline within the URL, often used for images or other data (e.g., data:image/png;base64,...).FILE - Accesses local files on a user'\''s computer (e.g., file:///C:/path/to/file).WS (WebSocket) - Establishes WebSocket connections for real-time data exchange (e.g., ws://example.com). 
WSS (WebSocket Secure) - Secure version of WebSocket for encrypted communication (e.g., wss://example.com).GOPHER - Used by the Gopher protocol, an early internet search and retrieval protocol (e.g., gopher://example.com).IRC (Internet Relay Chat) - Used to connect to IRC servers for chatting (e.g., irc://irc.example.com).IRCS (Secure IRC) - Secure version of IRC with SSL encryption (e.g., ircs://irc.example.com).MAGNET - Links to files using peer-to-peer file sharing, often with torrents (e.g., magnet:?xt=urn:btih:...).SMB (Server Message Block) - Accesses shared files on a network (e.g., smb://example.com/share).NFS (Network File System) - Similar to SMB, for accessing files on network storage (e.g., nfs://example.com/path).SSH (Secure Shell) - Used for connecting to SSH servers, usually for secure command-line access (e.g., ssh://user@example.com).RTSP (Real-Time Streaming Protocol) - Used for streaming media (e.g., rtsp://example.com).RTP (Real-Time Transport Protocol) - Another protocol for real-time audio/video streaming (often paired with RTSP).VPN (Virtual Private Network) - Commonly proprietary, but sometimes URLs for VPN connections are in formats like vpn://example.com.ABOUT - A protocol for displaying internal browser information (e.g., about:blank).CHROME - A protocol for Chrome’s internal settings and pages (e.g., chrome://settings).EDGE - Similar to Chrome, for Microsoft Edge settings and pages (e.g., edge://settings).JAVASCRIPT - Used to run inline JavaScript code in browsers (e.g., javascript:alert('\''Hello'\'');).LDAP (Lightweight Directory Access Protocol) - Used to access directory services, like authentication systems (e.g., ldap://example.com).LDAPS (Secure LDAP) - Secure version of LDAP, usually with SSL/TLS encryption (e.g., ldaps://example.com).URN (Uniform Resource Name) - Identifies resources by name rather than location (e.g., urn:isbn:0451450523).GEMINI - Used for the Gemini protocol, a lightweight alternative to HTTP (e.g., 
gemini://example.com).SIP (Session Initiation Protocol) - Used for initiating communication sessions, often in VoIP (e.g., sip:user@example.com).SIPS (Secure SIP) - Secure version of SIP for encrypted VoIP calls (e.g., sips:user@example.com).CALLTO - Similar to tel:, used for initiating voice or video calls (e.g., callto://+1234567890).WEB+ - Custom protocol prefix for web applications to create custom handlers (e.g., web+myapp://example).MS-WORD / MS-EXCEL / MS-POWERPOINT - Protocols specific to Microsoft Office applications, often used to open Office documents directly (e.g., ms-word:ofe|u|https://example.com/doc.docx).ZOOM - Used by the Zoom app to initiate video meetings (e.g., zoommtg://zoom.us/join?confno=123456789).TEAMS - Used by Microsoft Teams to initiate calls or meetings (e.g., msteams://example.com).SLACK - Used by Slack to open channels, DMs, or apps (e.g., slack://open?team=example).ITMS / ITMS-APPSTORE - Links to Apple’s iTunes or App Store (e.g., itms://itunes.apple.com/...).MAPS - Used to open map locations in mapping applications (e.g., maps://?q=1600+Amphitheatre+Parkway).MS-WEB - Used by Microsoft for some web-based services (e.g., ms-web://example.com).NEWS - Used for Usenet news groups (e.g., news:comp.lang.python).NNTP (Network News Transfer Protocol) - Accesses Usenet articles directly (e.g., nntp://news.example.com).ANDROID - Custom protocol for Android apps (e.g., intent://scan/#Intent;...).INTENT - Another Android protocol for app deep links (e.g., intent://example.com#Intent;...).PAYPAL - Opens links directly to PayPal actions (e.g., paypal://).VNC (Virtual Network Computing) - Used for remote desktop connections (e.g., vnc://example.com).TOR - Used to connect via the Tor network (e.g., tor://example.onion).ED2K - Protocol used by eDonkey and eMule file-sharing networks (e.g., ed2k://|file|...).SECOND-LIFE - Used by the Second Life virtual world to launch specific locations or interactions (e.g., secondlife://Region/128/128/0).AAAS (AAA 
Secure) - Protocol for authentication, authorization, and accounting in network services."
}'
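For comparison, the same submit request can be built with Python's standard library. This is a hedged sketch: `build_submit_request` and `SUBMIT_URL` are illustrative names, not part of any ApyHub SDK, and the function only constructs the request without sending it.

```python
import json
import urllib.request

# Endpoint taken from the request example above.
SUBMIT_URL = "https://api.apyhub.com/sharpapi/api/v1/content/detect_urls"


def build_submit_request(token: str, content: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that submits a detection job."""
    # json.dumps handles quoting and escaping of the text to scan.
    body = json.dumps({"content": content}).encode("utf-8")
    return urllib.request.Request(
        SUBMIT_URL,
        data=body,
        headers={"apy-token": token, "Content-Type": "application/json"},
        method="POST",
    )
```

To send it, pass the returned object to `urllib.request.urlopen` and decode the JSON body of the response. Letting `json.dumps` build the body avoids the manual quote escaping (`'\''`) that the shell version needs.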
Detect URLs
Method: POST
Content Type: application/json
Request Body
| Attribute | Type | Mandatory | Description |
|---|---|---|---|
| content | string | Yes | The content from which URLs are to be detected. |
Sample Response
{
  "status_url": "https://api.apyhub.com/sharpapi/api/v1/content/content_detect_urls/job/status/5de4887a-0dfd-49b6-8edb-9280e468c210",
  "job_id": "5de4887a-0dfd-49b6-8edb-9280e468c210"
}
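A client needs both fields from this response: `status_url` to poll and `job_id` to identify the job. The following sketch parses and validates the body; `SubmittedJob` and `parse_submit_response` are hypothetical names introduced here for illustration.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class SubmittedJob:
    job_id: str
    status_url: str


def parse_submit_response(body: str) -> SubmittedJob:
    """Extract job_id and status_url from the submit response body."""
    data = json.loads(body)
    # Fail early with a clear message if an expected field is absent.
    missing = [key for key in ("job_id", "status_url") if key not in data]
    if missing:
        raise ValueError(f"response is missing fields: {missing}")
    return SubmittedJob(job_id=data["job_id"], status_url=data["status_url"])
```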

HTTP Response Codes

The method may return one of the following HTTP status codes:
| Status Code | Description |
|---|---|
| 202 | The job was submitted successfully. |
| 401 | Required authentication information is either missing or not valid for the resource. |
| 400 | Invalid input - if the file is invalid or corrupted. |
| 500 | If any unexpected error occurs while submitting the request. |
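One way a client might act on these codes is sketched below. The mapping of codes to Python exception types is a design choice made for this example, not something the API prescribes, and `check_submit_status` is a hypothetical helper name.

```python
# Descriptions paraphrased from the status code table above.
STATUS_MEANINGS = {
    202: "The job was submitted successfully.",
    400: "Invalid input.",
    401: "Authentication information is missing or not valid.",
    500: "An unexpected error occurred while submitting the request.",
}


def check_submit_status(code: int) -> None:
    """Return None for 202; raise a descriptive exception for anything else."""
    if code == 202:
        return None
    meaning = STATUS_MEANINGS.get(code, "Undocumented status code.")
    message = f"HTTP {code}: {meaning}"
    if code == 401:
        raise PermissionError(message)
    if code == 400:
        raise ValueError(message)
    # 500 and any undocumented code are treated as generic failures.
    raise RuntimeError(message)
```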

Authentication

All API requests to ApyHub services must be authenticated. ApyHub currently supports token and basic authentication. You can generate and view your credentials under "API Keys" in your workspace settings (on the left side of the navbar).
Points to note:
  • Credential secrets are generated on the fly and are not stored in plain text, so save the secret somewhere safe when you generate a credential.
  • Pass the token in the apy-token header.
  • Send basic authentication credentials in the Authorization header.
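The two header options can be sketched as follows. The `apy-token` header name comes from this page; the `Basic` credential encoding (base64 of `username:password`) is an assumption based on the standard HTTP Basic scheme, since the page only names the Authorization header. `auth_headers` is a hypothetical helper.

```python
import base64
from typing import Dict, Optional


def auth_headers(
    token: Optional[str] = None,
    username: Optional[str] = None,
    password: Optional[str] = None,
) -> Dict[str, str]:
    """Return request headers for token auth or basic auth."""
    if token is not None:
        # Token authentication: the token goes in the apy-token header.
        return {"apy-token": token}
    if username is not None and password is not None:
        # ASSUMPTION: standard HTTP Basic encoding of "username:password".
        raw = f"{username}:{password}".encode("utf-8")
        encoded = base64.b64encode(raw).decode("ascii")
        return {"Authorization": f"Basic {encoded}"}
    raise ValueError("provide either a token or a username and password")
```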

Error codes

{
  "error": {
    "code": 105,
    "message": "Invalid URL"
  }
}
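A client can check a response body for this error envelope before treating it as a result. The sketch below assumes errors always arrive under a top-level `error` object shaped like the sample above; `parse_error` is a hypothetical helper name.

```python
import json
from typing import Any, Optional, Tuple


def parse_error(body: str) -> Optional[Tuple[Any, Any]]:
    """Return (code, message) if body is an error envelope, otherwise None."""
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        # Not JSON at all, so not an error envelope in the sample's shape.
        return None
    error = data.get("error") if isinstance(data, dict) else None
    if not isinstance(error, dict):
        return None
    return error.get("code"), error.get("message")
```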