Re-scrape a website source
POST /bv/aisk/v1/sources/:id:re-scrape
This API re-scrapes a website source resource.
It returns a 400 / BadRequest error in either of the following cases:
- ERROR_REASON_SOURCE_REQUIRES_A_WEBSITE_TYPE is returned if the source type is not SOURCE_TYPE_WEBSITE.
- ERROR_REASON_SOURCE_REQUIRES_IN_READY_STATUS is returned if the source status is not SOURCE_STATUS_READY.
It returns a 404 / NotFound error if any requested resource is not found.
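The two 400 conditions can be checked on the client before calling the endpoint. The following is a minimal sketch, assuming the source is already available as a dict shaped like the `source` object in the response schema; the function name and the order of the checks are illustrative, not part of the API.

```python
from typing import Optional

# Enum values and error reasons are quoted from this page.
WEBSITE_TYPE = "SOURCE_TYPE_WEBSITE"
READY_STATUS = "SOURCE_STATUS_READY"


def rescrape_rejection_reason(source: dict) -> Optional[str]:
    """Return the error reason a re-scrape would likely get, or None if eligible.

    Only the `type` and `status` keys of `source` are read. The order of the
    two checks is an assumption; the page does not say which one wins.
    """
    if source.get("type") != WEBSITE_TYPE:
        return "ERROR_REASON_SOURCE_REQUIRES_A_WEBSITE_TYPE"
    if source.get("status") != READY_STATUS:
        return "ERROR_REASON_SOURCE_REQUIRES_IN_READY_STATUS"
    return None
```

A return value of None only means that neither documented 400 condition applies; the server remains the authority and may still reject the request for other reasons.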
Request
Path Parameters
id: Required. The UUID of the source.
The source type must be SOURCE_TYPE_WEBSITE.
Body (required, application/json): object
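As an illustration of the request shape, here is a sketch that builds the POST request with Python's standard library without sending it. The host `https://api.example.com` is a hypothetical placeholder (this page does not state the API host), the `Bearer` scheme is an assumption (the page only names the `authorization` header), and the empty JSON object is one possible body.

```python
import json
import urllib.request

# Hypothetical placeholder; this page does not state the API host.
BASE_URL = "https://api.example.com"


def build_rescrape_request(
    source_id: str, token: str, base_url: str = BASE_URL
) -> urllib.request.Request:
    """Build (but do not send) the POST request that re-scrapes a source."""
    url = f"{base_url}/bv/aisk/v1/sources/{source_id}:re-scrape"
    return urllib.request.Request(
        url,
        # The body is a required JSON object; an empty object is sent here.
        data=json.dumps({}).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumed auth scheme; adjust to whatever the service expects.
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would send it; that function raises `urllib.error.HTTPError` for 4xx and 5xx responses, so the error bodies described under Responses are read from the exception object.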
Responses
- 200
- 400
- 401
- 403
- 500
- default
A successful response.
Content type: application/json
Schema
source object
id: Output only. The UUID of the source.
type: Required. Immutable. The type of the source. Possible values: [SOURCE_TYPE_WEBSITE, SOURCE_TYPE_LOCAL_DRIVE, SOURCE_TYPE_VOD, SOURCE_TYPE_VIDEO, SOURCE_TYPE_SUBTITLE, SOURCE_TYPE_TEXT, SOURCE_TYPE_AOD]
name: Required. The name of the source.
status: Output only. The status of the source. Possible values: [SOURCE_STATUS_INGESTING, SOURCE_STATUS_PREPARING, SOURCE_STATUS_READY, SOURCE_STATUS_UPDATING, SOURCE_STATUS_FAILED, SOURCE_STATUS_DELETING]
format: Output only. Possible values: [SOURCE_FORMAT_WEBSITE, SOURCE_FORMAT_PDF, SOURCE_FORMAT_DOC, SOURCE_FORMAT_DOCX, SOURCE_FORMAT_VOD_TO_TEXT, SOURCE_FORMAT_VIDEO_TO_TEXT, SOURCE_FORMAT_SRT, SOURCE_FORMAT_VTT, SOURCE_FORMAT_TEXT, SOURCE_FORMAT_AOD_TO_TEXT]
size_in_bytes: Output only. The size of the source in bytes.
file object
Output only. Available if type is SOURCE_TYPE_LOCAL_DRIVE or SOURCE_TYPE_SUBTITLE.
id: Output only. The UUID of the file.
type: Output only. The type of the file. Possible values: [FILE_TYPE_VIDEO, FILE_TYPE_IMAGE, FILE_TYPE_SUBTITLE, FILE_TYPE_DOCUMENT, FILE_TYPE_WEB_LINK, FILE_TYPE_AUDIO]
name: Output only. The name of the file.
size_in_bytes: Output only. The size of the file in bytes.
vod object
Output only. Available if type is SOURCE_TYPE_VOD.
id: Output only. The UUID of the vod.
video object
Output only. Available if type is SOURCE_TYPE_VIDEO.
id: Output only. The UUID of the video.
text object
Output only. Available if type is SOURCE_TYPE_TEXT.
aod object
Output only. Available if type is SOURCE_TYPE_AOD.
id: Output only. The UUID of the aod.
metadata object
Optional. The metadata of the source.
keyword_1 object
data object[]
keyword_2 object
data object[]
keyword_3 object
data object[]
integer_range_1 object
data object[]
integer_range_2 object
data object[]
integer_range_3 object
data object[]
boolean_1 object
data object
boolean_2 object
data object
boolean_3 object
data object
text_1 object
data object
text_2 object
data object
text_3 object
data object
next_update_time: Output only. The time at which the source will next be updated.
character_count: Output only. The number of characters used by this source.
summary object
Output only. The summary of this source.
content: The content of the summary.
status: The status of the summary. Possible values: [SUMMARY_STATUS_WAITING, SUMMARY_STATUS_PROCESSING, SUMMARY_STATUS_READY, SUMMARY_STATUS_FAILED]
error_infos object[]
Output only. Optional. A list of messages carrying error information when vod encoding fails.
reason: The reason for the error. This is a constant value that identifies the proximate cause of the error. Error reasons are unique within a particular domain of errors. This should be at most 63 characters and match the regular expression [A-Z][A-Z0-9_]+[A-Z0-9], which represents UPPER_SNAKE_CASE.
domain: The logical grouping to which the "reason" belongs. The error domain is typically the registered service name of the tool or product that generates the error. Example: "pubsub.googleapis.com". If the error is generated by some common infrastructure, the error domain must be a globally unique value that identifies the infrastructure. For Google API infrastructure, the error domain is "googleapis.com".
metadata object
Additional structured details about this error.
Keys should match /[a-zA-Z0-9-_]/ and be limited to 64 characters in length. When identifying the current value of an exceeded limit, the units should be contained in the key, not the value. For example, if the client exceeds the number of instances that can be created in a single (batch) request, return {"instanceLimitPerRequest": "100"} rather than {"instanceLimit": "100/request"}.
created_at: Output only. The time at which the source was created.
updated_at: Output only. The time at which the source was last updated.
error_infos object[]
Output only. A list of messages carrying error information when the source fails.
reason: The reason for the error. This is a constant value that identifies the proximate cause of the error. Error reasons are unique within a particular domain of errors. This should be at most 63 characters and match the regular expression [A-Z][A-Z0-9_]+[A-Z0-9], which represents UPPER_SNAKE_CASE.
domain: The logical grouping to which the "reason" belongs. The error domain is typically the registered service name of the tool or product that generates the error. Example: "pubsub.googleapis.com". If the error is generated by some common infrastructure, the error domain must be a globally unique value that identifies the infrastructure. For Google API infrastructure, the error domain is "googleapis.com".
metadata object
Additional structured details about this error.
Keys should match /[a-zA-Z0-9-_]/ and be limited to 64 characters in length. When identifying the current value of an exceeded limit, the units should be contained in the key, not the value. For example, if the client exceeds the number of instances that can be created in a single (batch) request, return {"instanceLimitPerRequest": "100"} rather than {"instanceLimit": "100/request"}.
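The constraints on error reasons and metadata keys described above can be expressed as two small checks. This is a sketch rather than the service's own validation: the helper names are hypothetical, and it assumes the key rule /[a-zA-Z0-9-_]/ is meant to apply to every character of a key.

```python
import re

# Pattern and length limit quoted from the `reason` description.
REASON_PATTERN = re.compile(r"[A-Z][A-Z0-9_]+[A-Z0-9]")
MAX_REASON_LENGTH = 63

# Assumption: every character of a metadata key must be in the listed class,
# and a key has between 1 and 64 characters.
METADATA_KEY_PATTERN = re.compile(r"[a-zA-Z0-9_-]{1,64}")


def is_valid_reason(reason: str) -> bool:
    """Check the UPPER_SNAKE_CASE pattern and the 63-character limit."""
    return (
        len(reason) <= MAX_REASON_LENGTH
        and REASON_PATTERN.fullmatch(reason) is not None
    )


def is_valid_metadata_key(key: str) -> bool:
    """Check the character class and the 64-character limit for a key."""
    return METADATA_KEY_PATTERN.fullmatch(key) is not None
```

Both error reasons named on this page pass `is_valid_reason`, and the `instanceLimitPerRequest` key from the description passes `is_valid_metadata_key`.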
Example (from schema)
{
  "source": {
    "id": "string",
    "type": "SOURCE_TYPE_WEBSITE",
    "name": "string",
    "status": "SOURCE_STATUS_INGESTING",
    "format": "SOURCE_FORMAT_WEBSITE",
    "size_in_bytes": "string",
    "file": {
      "id": "string",
      "type": "FILE_TYPE_VIDEO",
      "name": "string",
      "size_in_bytes": "string"
    },
    "vod": {
      "id": "string"
    },
    "video": {
      "id": "string"
    },
    "text": {
      "content": "string"
    },
    "aod": {
      "id": "string"
    },
    "metadata": {
      "keyword_1": {
        "data": [
          {
            "value": "string"
          }
        ]
      },
      "keyword_2": {
        "data": [
          {
            "value": "string"
          }
        ]
      },
      "keyword_3": {
        "data": [
          {
            "value": "string"
          }
        ]
      },
      "integer_range_1": {
        "data": [
          {
            "gte": "string",
            "lte": "string"
          }
        ]
      },
      "integer_range_2": {
        "data": [
          {
            "gte": "string",
            "lte": "string"
          }
        ]
      },
      "integer_range_3": {
        "data": [
          {
            "gte": "string",
            "lte": "string"
          }
        ]
      },
      "boolean_1": {
        "data": {
          "value": true
        }
      },
      "boolean_2": {
        "data": {
          "value": true
        }
      },
      "boolean_3": {
        "data": {
          "value": true
        }
      },
      "text_1": {
        "data": {
          "value": "string"
        }
      },
      "text_2": {
        "data": {
          "value": "string"
        }
      },
      "text_3": {
        "data": {
          "value": "string"
        }
      }
    },
    "next_update_time": "2024-07-29T15:51:28.071Z",
    "character_count": "string",
    "summary": {
      "content": "string",
      "status": "SUMMARY_STATUS_WAITING",
      "error_infos": [
        {
          "reason": "string",
          "domain": "string",
          "metadata": {}
        }
      ]
    },
    "created_at": "2024-07-29T15:51:28.071Z",
    "updated_at": "2024-07-29T15:51:28.071Z",
    "error_infos": [
      {
        "reason": "string",
        "domain": "string",
        "metadata": {}
      }
    ]
  }
}
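A client will usually read only a few fields from this response. The sketch below pulls out the id, status, size, and next update time; `summarize_source` is a hypothetical helper, and it assumes that a real response carries `size_in_bytes` as a numeric string and timestamps in the ISO 8601 form shown in the example.

```python
import json
from datetime import datetime


def summarize_source(response_body: str) -> dict:
    """Extract a few fields from the JSON body of a 200 response."""
    source = json.loads(response_body)["source"]
    next_update = source.get("next_update_time")
    return {
        "id": source["id"],
        "status": source["status"],
        # Assumption: a real response carries a numeric string here.
        "size_in_bytes": int(source["size_in_bytes"]),
        # Swap the trailing "Z" for an explicit offset so that
        # fromisoformat also accepts it on Python versions before 3.11.
        "next_update_time": (
            datetime.fromisoformat(next_update.replace("Z", "+00:00"))
            if next_update
            else None
        ),
    }
```

The schema example itself uses the placeholder value "string" for `size_in_bytes`, so the helper is meant for real responses and would raise ValueError on the placeholder.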
A bad request response.
A code of 3 means that an invalid argument was received. See the HTTP status code mappings and the gRPC code list for other values.
Content type: application/json
Schema
code
message
details object[]
@type: A URL/resource name that uniquely identifies the type of the serialized protocol buffer message. This string must contain at least one "/" character. The last segment of the URL's path must represent the fully qualified name of the type (as in path/google.protobuf.Duration). The name should be in a canonical form (e.g., leading "." is not accepted).
In practice, teams usually precompile into the binary all types that they expect it to use in the context of Any. However, for URLs which use the scheme http, https, or no scheme, one can optionally set up a type server that maps type URLs to message definitions as follows:
- If no scheme is provided, https is assumed.
- An HTTP GET on the URL must yield a google.protobuf.Type value in binary format, or produce an error.
- Applications are allowed to cache lookup results based on the URL, or have them precompiled into a binary to avoid any lookup. Therefore, binary compatibility needs to be preserved on changes to types. (Use versioned type names to manage breaking changes.)
Note: this functionality is not currently available in the official protobuf release, and it is not used for type URLs beginning with type.googleapis.com. As of May 2023, there are no widely used type server implementations and no plans to implement one.
Schemes other than http, https (or the empty scheme) might be used with implementation specific semantics.
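The type URL rules above (at least one "/", the last path segment is the fully qualified type name, no leading ".") can be illustrated with a short helper. `type_name_from_url` is a hypothetical name, and the sketch only inspects the string; it does not contact any type server.

```python
def type_name_from_url(type_url: str) -> str:
    """Return the fully qualified type name: the last path segment of a type URL."""
    if "/" not in type_url:
        raise ValueError('type URL must contain at least one "/" character')
    # Everything after the final "/" is the type name.
    name = type_url.rsplit("/", 1)[1]
    if not name or name.startswith("."):
        raise ValueError("type name must be non-empty and must not start with '.'")
    return name
```

For the example from the description, `type_name_from_url("path/google.protobuf.Duration")` yields `google.protobuf.Duration`.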
Example (from schema)
{
  "code": 0,
  "message": "string",
  "details": [
    {
      "@type": "string"
    }
  ]
}
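An error body of this shape can be turned into something easier to branch on. The following sketch maps code 3 to its gRPC name and collects error reasons; it assumes that reasons such as ERROR_REASON_SOURCE_REQUIRES_IN_READY_STATUS arrive as a `reason` field inside the `details` entries, which this page does not confirm, and `describe_error` is a hypothetical helper.

```python
import json

# Only code 3 is documented on this page; other codes are left unnamed.
KNOWN_CODES = {3: "INVALID_ARGUMENT"}


def describe_error(response_body: str) -> dict:
    """Summarize an error response body (code, message, details)."""
    error = json.loads(response_body)
    code = error.get("code")
    reasons = [
        detail["reason"]
        for detail in error.get("details", [])
        # Assumption: error reasons are carried as a `reason` field of a detail.
        if isinstance(detail, dict) and "reason" in detail
    ]
    return {
        "code": code,
        "code_name": KNOWN_CODES.get(code, "UNKNOWN_TO_THIS_SKETCH"),
        "message": error.get("message", ""),
        "reasons": reasons,
    }
```

A caller could then wait and retry when the reasons include ERROR_REASON_SOURCE_REQUIRES_IN_READY_STATUS, and give up when they include ERROR_REASON_SOURCE_REQUIRES_A_WEBSITE_TYPE, since the source type is immutable.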
An unauthenticated response.
The authorization header was missing or could not be identified.
Content type: application/json
Schema
code
message
details object[]
@type: A URL/resource name that uniquely identifies the type of the serialized protocol buffer message. This string must contain at least one "/" character. The last segment of the URL's path must represent the fully qualified name of the type (as in path/google.protobuf.Duration). The name should be in a canonical form (e.g., leading "." is not accepted).
In practice, teams usually precompile into the binary all types that they expect it to use in the context of Any. However, for URLs which use the scheme http, https, or no scheme, one can optionally set up a type server that maps type URLs to message definitions as follows:
- If no scheme is provided, https is assumed.
- An HTTP GET on the URL must yield a google.protobuf.Type value in binary format, or produce an error.
- Applications are allowed to cache lookup results based on the URL, or have them precompiled into a binary to avoid any lookup. Therefore, binary compatibility needs to be preserved on changes to types. (Use versioned type names to manage breaking changes.)
Note: this functionality is not currently available in the official protobuf release, and it is not used for type URLs beginning with type.googleapis.com. As of May 2023, there are no widely used type server implementations and no plans to implement one.
Schemes other than http, https (or the empty scheme) might be used with implementation specific semantics.
Example (from schema)
{
  "code": 0,
  "message": "string",
  "details": [
    {
      "@type": "string"
    }
  ]
}
A forbidden response.
The provided authorization did not have sufficient permission to access the resource or the API.
Content type: application/json
Schema
code
message
details object[]
@type: A URL/resource name that uniquely identifies the type of the serialized protocol buffer message. This string must contain at least one "/" character. The last segment of the URL's path must represent the fully qualified name of the type (as in path/google.protobuf.Duration). The name should be in a canonical form (e.g., leading "." is not accepted).
In practice, teams usually precompile into the binary all types that they expect it to use in the context of Any. However, for URLs which use the scheme http, https, or no scheme, one can optionally set up a type server that maps type URLs to message definitions as follows:
- If no scheme is provided, https is assumed.
- An HTTP GET on the URL must yield a google.protobuf.Type value in binary format, or produce an error.
- Applications are allowed to cache lookup results based on the URL, or have them precompiled into a binary to avoid any lookup. Therefore, binary compatibility needs to be preserved on changes to types. (Use versioned type names to manage breaking changes.)
Note: this functionality is not currently available in the official protobuf release, and it is not used for type URLs beginning with type.googleapis.com. As of May 2023, there are no widely used type server implementations and no plans to implement one.
Schemes other than http, https (or the empty scheme) might be used with implementation specific semantics.
Example (from schema)
{
  "code": 0,
  "message": "string",
  "details": [
    {
      "@type": "string"
    }
  ]
}
A server error response. See the HTTP status code mappings for other codes.
Content type: application/json
Schema
code
message
details object[]
@type: A URL/resource name that uniquely identifies the type of the serialized protocol buffer message. This string must contain at least one "/" character. The last segment of the URL's path must represent the fully qualified name of the type (as in path/google.protobuf.Duration). The name should be in a canonical form (e.g., leading "." is not accepted).
In practice, teams usually precompile into the binary all types that they expect it to use in the context of Any. However, for URLs which use the scheme http, https, or no scheme, one can optionally set up a type server that maps type URLs to message definitions as follows:
- If no scheme is provided, https is assumed.
- An HTTP GET on the URL must yield a google.protobuf.Type value in binary format, or produce an error.
- Applications are allowed to cache lookup results based on the URL, or have them precompiled into a binary to avoid any lookup. Therefore, binary compatibility needs to be preserved on changes to types. (Use versioned type names to manage breaking changes.)
Note: this functionality is not currently available in the official protobuf release, and it is not used for type URLs beginning with type.googleapis.com. As of May 2023, there are no widely used type server implementations and no plans to implement one.
Schemes other than http, https (or the empty scheme) might be used with implementation specific semantics.
Example (from schema)
{
  "code": 0,
  "message": "string",
  "details": [
    {
      "@type": "string"
    }
  ]
}
An unexpected error response.
Content type: application/json
Schema
code
message
details object[]
@type: A URL/resource name that uniquely identifies the type of the serialized protocol buffer message. This string must contain at least one "/" character. The last segment of the URL's path must represent the fully qualified name of the type (as in path/google.protobuf.Duration). The name should be in a canonical form (e.g., leading "." is not accepted).
In practice, teams usually precompile into the binary all types that they expect it to use in the context of Any. However, for URLs which use the scheme http, https, or no scheme, one can optionally set up a type server that maps type URLs to message definitions as follows:
- If no scheme is provided, https is assumed.
- An HTTP GET on the URL must yield a google.protobuf.Type value in binary format, or produce an error.
- Applications are allowed to cache lookup results based on the URL, or have them precompiled into a binary to avoid any lookup. Therefore, binary compatibility needs to be preserved on changes to types. (Use versioned type names to manage breaking changes.)
Note: this functionality is not currently available in the official protobuf release, and it is not used for type URLs beginning with type.googleapis.com. As of May 2023, there are no widely used type server implementations and no plans to implement one.
Schemes other than http, https (or the empty scheme) might be used with implementation specific semantics.
Example (from schema)
{
  "code": 0,
  "message": "string",
  "details": [
    {
      "@type": "string"
    }
  ]
}